Business Intelligence in Azure – SQLBits 2018 Precon

What can you expect from my SQLBits pre conference training day in February 2018 at the London Olympia?

Well my friends, in short, we are going to take whirlwind tour of the entire business intelligence stack of services in Azure. No stone will be left unturned. No service will be left without scalability. We’ll cover them all and we certainly aren’t going to check with the Azure bill payer before turning up the compute on our data transforms.

What will we actually cover?

With new cloud services and advancements in locally hosted platforms developing a lambda architecture is becoming the new normal. In this full day of high level training we’ll learn how to architect hybrid business intelligence solutions using Microsoft Azure offerings. We’ll explore the roles of these cloud data services and how to make them work for you in this complete overview of business intelligence on the Microsoft cloud data platform.

Here’s how we’ll break that down during the day…

Module 1 – Getting Started with Azure

Using platform as a service products is great, but let’s take a step back. To kick off we’ll cover the basics for deploying and managing your Azure services. Navigating the Azure portal and building dashboards isn’t always as intuitive as we’d like. What’s a resource group? And why is it important to understand your Azure Activity Directory tenant?

Module 2 – An Overview of BI in Azure

What’s available for the business intelligence architect in the cloud and how might these services relate to traditional on premises ETL and cube data flows. Is ETL enough for our big unstructured data sources or do we need to mix things up and add some more letters to the acronym in the cloud?

Module 3 – Databases in Azure (SQL DB, SQL DW, Cosmos DB, SQL MI)

It’s SQL Server Jim, but not as we know it. Check out the PaaS flavours of our long term on premises friends. Can we trade the agent and an operating system for that sliding bar of scalable compute? DTU and DWU are here to stay with new SLA’s relating to throughput. Who’s on ACID and as BI people do we care?

Module 4 – The Azure Machines are here to Learn

Data scientist or developer? Azure Machine Learning was designed for applied machine learning. Use best-in-class algorithms in a simple drag-and-drop interface. We’ll go from idea to deployment in a matter of clicks. Without a terminator in sight!

Module 5 – Swimming in the Data Lake with U-SQL

Let’s understand the role of this hyper-scale two tier big data technology and how to harness its power with U-SQL, the offspring of T-SQL and C#. We’ll cover everything you need to know to get started developing solutions with Azure Data Lake.

Module 6 – IoT, Event Hubs and Azure Stream Analytics

Real-time data is everywhere. We need to use it and unlock it as a rich source of information that can be channelled to react to events, produce alerts from sensor values or in 9000 other scenarios. In this module, we’ll learn how, using Azure messaging hubs and Azure Stream Analytics.

Module 7 – Power BI, our Sematic Layer, is it All Things to All People?

Combining all our data sources in one place with rich visuals and a flexible data modelling tool. Power BI takes it all, small data, big data, streaming data, website content and more. But we really need a Venn diagram to decide when/where it’s needed.

Module 8 – Data Integration with Azure Data Factory and SSIS

The new integration runtime is here. But how do we unlock the scale out potential of our control flow and data flow? Let’s learn to create the perfect dependency driven pipeline for our data flows. Plus, how to work with the Azure Batch Service should you need that extensibility.


Finally we’ll wrap up the day by playing the Azure icon game, which you’ll all now be familiar with and able to complete with a perfect score having completed this training day 🙂

Many thanks for reading and I hope to see you in February, its going to be magic 😉

Register now: https://www.regonline.com/registration/Checkin.aspx?EventID=2023328

All training day content is subject to change, dependant on timings and the demo gods will!


Community Speaking Analysis with Power BI

Just a short post.

During the SQL Saturday Cambridge speaker meal on Saturday night I was chatting with various people at the dinner table. During this time I got asked the question by an event sponsor “do you speak at many of these things?”. While the general answer I gave at the time was yes. It occurred to me that I didn’t have any accurate means of answering the question with some richer detail. Being a data/business intelligence person this required some serious self reprimanding. I could of referred to my MVP activity log, but that isn’t easily accessible and contains a list of all community involvement. I therefore decided to create a dataset just for speaking within Excel and visualise it with Power BI. Below is what I created.

Nothing to fancy here, just a simple dataset containing 6 fields:

  • Date
  • Event
  • City
  • Postcode (for the map)
  • Attended
  • Subject

What do you think? Click on it to enlarge.

Generally I spend my time architecting data solutions and developing the services to deliver business intelligence to customers. I confess, I don’t spend much time visualising that data and making things look pretty in tools like Power BI, so go easy with the comments.

I of course then published this to PowerBI.com meaning when I get asked the question next time I can simply load the Power BI app on my phone and answer most questions. Being a good developer (I think) I also created a mobile friendly view of the visuals.

If you’d like access to the published version of the dashboard please use this link to request access: https://app.powerbi.com/groups/me/dashboards/91878616-07f4-4f8b-be54-b4df46a2aab7

Many thanks for reading

Connecting PowerBI.com to Azure Data Lake Store – Across Tenants

Welcome readers, this is a post to define a problem that shouldn’t exist. But sadly, does exist and given its relative complexity I think warrants some explanation. Plus, I’ve included details of what you can currently do if you encounter it.

First some background…

Power BI Desktop

With the recent update to the Power BI desktop application we now find the Azure Data Lake Store connector has finally relinquished its ‘(Beta)’ status and is considered GA. This is good news, but doesn’t make any difference to those of us that have already be storing our outputs as files in Data Lake Store.

The connector as before can be supplied with any storage ADL:// URL and set of credentials using the desktop Power BI application. Our local machines are of course external to the concept and context of a Microsoft Cloud tenant and directory. To reiterate, this means any Data Lake Store anywhere can be queried and datasets refreshed using local tools. It doesn’t even matter about Personal vs Work/School accounts.

This hopefully sets the scene for this posts and starts to allude to the problem your likely to encounter if you want to use your developed visuals beyond your own computer.


In this scenario, we’ve developed our Power BI workbook in the desktop product and hit publish. Armed with a valid Office 365/Power BI account the visuals, initial working data, model, measures and connection details for the data source get transferred to the web service version of Power BI, known as PowerBI.com. So far so good.

Next, you want to share and automatically refresh the data, meaning your audience have the latest data at the point of viewing, given a reasonable schedule.

Sharing, no problem at all, assuming you understand the latest Power BI Premium/Free apps, packs, workspace licencing stuff!… A post for another time. Maybe.

Automatic dataset refreshes, not so simple. This expects several ducks to all be lined up exactly. By ducks I mean your Azure Subscription and Office 365 tenant. If they aren’t and one little ducky has strayed from the pack/group/heard (what’s a collection of ducks?). This is want you’ll encounter.

Failed to update data source credentials: The credentials provided for the DataLake source are invalid.

Now this error is also misleading because the problem is not invalid credentials on the face of it. A better error message would say invalid credentials for the tenant of the target data source.

Problem Overview

As most systems and environments evolve its common (given the experience of several customer) to accidentally create a disconnection between your Azure Subscription and your Office 365 environments. This may result in each group of services residing in different directory services or tenants.

In this case the disconnection means you will not be able to authenticate your PowerBI.com datasets against your Azure Data Lake Store allowing for that very important scheduled data refresh.
Coming back to the title of this blog post:

You cannot currently authenticate against an Azure Data Lake Store from PowerBI.com across tenants.

What Do To

Once you’ve finished cursing, considering everything you’ve developed over the last 6 months in your Azure Subscription. Take a breath. Unfortunately, the only long term thing you can do is setup a new Azure Subscription and make dam sure that it’s linked to your Office 365 office and thus residing in the same tenant. Then migrate your Data Lake Store to the new subscription.

Once these ducks are in line the credentials you supply to PowerBI.com for the dataset refresh will be accepted. I promise. I’ve done it.

A short-term work around is to refresh your datasets in the desktop app every day and republish new versions. Very manual. Sorry to be the bearer of bad news.

What Next

Well my friends, I recommend that we strongly petition Microsoft to lift this restriction. I say restriction because it seems like madness. After all the PowerBI.com connector to Azure Data Lake is using OAuth2, so what’s the problem. Furthermore, back in Power BI Desktop land we can connect to any storage with any credentials. We can even create a Power BI workbook joining 2 Data Lake Stores with 2 different sets of credentials (handy if you have a partial data model in production and new outputs in a test environment).

Here is my attempt to get things changed and I’d appreciate your votes.


To conclude, I really want this blog post to get an update soon with some better news given the above. But for now, I hope it helped you understand the potential problem your facing. Or, raises your awareness to a future problem you are likely to encounter.

Many thanks for reading.

Paul’s Frog Blog

This blog is now closed to new comments or posts.

Please see new posts here