Connecting PowerBI.com to Azure Data Lake Store – Across Tenants
Welcome readers, this is a post about a problem that shouldn't exist, but sadly does, and given its relative complexity I think it warrants some explanation. Plus, I've included details of what you can currently do if you encounter it.
First some background…
Power BI Desktop
With the recent update to the Power BI Desktop application, the Azure Data Lake Store connector has finally relinquished its '(Beta)' status and is considered generally available (GA). This is good news, but it doesn't make any difference to those of us that have already been storing our outputs as files in Data Lake Store.
As before, the connector can be supplied with any Data Lake Store adl:// URL and set of credentials in the desktop Power BI application. Our local machines are, of course, outside the context of any Microsoft Cloud tenant and directory. To reiterate, this means any Data Lake Store anywhere can be queried and datasets refreshed using local tools. It doesn't even matter whether you use a Personal or a Work/School account.
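To see the same behaviour outside of Power BI, here is a minimal sketch using the azure-datalake-store Python SDK, which authenticates the same way the desktop connector does. All of the store names, tenants and credentials below are placeholders, not values from this post.

```python
def adl_url(store_name: str) -> str:
    """Build the adl:// URL that the Power BI connector expects for a store."""
    return f"adl://{store_name}.azuredatalakestore.net/"


if __name__ == "__main__":
    # Requires: pip install azure-datalake-store
    from azure.datalake.store import core, lib

    # Any tenant, any credentials: from a local machine the SDK, like
    # Power BI Desktop, does not care which tenant the store lives in.
    token = lib.auth(
        tenant_id="contoso.onmicrosoft.com",  # placeholder tenant
        username="someone@contoso.com",       # placeholder account
        password="...",                       # placeholder secret
    )
    adl = core.AzureDLFileSystem(token, store_name="mydatalake")
    print(adl.ls("/"))            # list files at the root of the store
    print(adl_url("mydatalake"))  # the URL you would paste into the connector
```

This is exactly why the desktop experience gives a false sense of security: the local tools happily cross tenant boundaries that PowerBI.com will not.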
This hopefully sets the scene for this post and starts to allude to the problem you're likely to encounter if you want to use your developed visuals beyond your own computer.
In this scenario, we’ve developed our Power BI workbook in the desktop product and hit publish. Armed with a valid Office 365/Power BI account the visuals, initial working data, model, measures and connection details for the data source get transferred to the web service version of Power BI, known as PowerBI.com. So far so good.
Next, you want to share and automatically refresh the data, meaning your audience have the latest data at the point of viewing, given a reasonable schedule.
Sharing, no problem at all, assuming you understand the latest Power BI Premium/Free apps, packs and workspace licensing stuff!… A post for another time. Maybe.
Automatic dataset refreshes, not so simple. This expects several ducks to be lined up exactly. By ducks I mean your Azure Subscription and your Office 365 tenant. If they aren't, and one little ducky has strayed from the pack/group/herd (what's a collection of ducks?), this is what you'll encounter:
Failed to update data source credentials: The credentials provided for the DataLake source are invalid.
Now this error is also misleading, because on the face of it the problem is not invalid credentials. A better error message would say the credentials are invalid for the tenant of the target data source.
As most systems and environments evolve, it's common (given the experience of several customers) to accidentally create a disconnection between your Azure Subscription and your Office 365 environment. This may result in each group of services residing in a different directory service or tenant.
In this case the disconnection means you will not be able to authenticate your PowerBI.com datasets against your Azure Data Lake Store, blocking that very important scheduled data refresh.
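If you suspect a stray duck, you can check which tenant sits behind your Office 365 domain without any special permissions, because Azure AD publishes an OpenID configuration document per domain whose issuer URL contains the tenant GUID. A minimal sketch (the domain is a placeholder; compare the result against your subscription's tenant, e.g. from `az account show --query tenantId`):

```python
import json
import re
from urllib.request import urlopen


def tenant_id_from_issuer(issuer: str) -> str:
    """Extract the tenant GUID from an AAD issuer URL such as
    https://sts.windows.net/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/"""
    match = re.search(r"[0-9a-fA-F-]{36}", issuer)
    if not match:
        raise ValueError(f"No tenant GUID found in issuer: {issuer}")
    return match.group(0)


def tenant_id_for_domain(domain: str) -> str:
    """Resolve a domain's tenant via the public OpenID configuration endpoint."""
    url = f"https://login.microsoftonline.com/{domain}/.well-known/openid-configuration"
    with urlopen(url) as resp:
        config = json.load(resp)
    return tenant_id_from_issuer(config["issuer"])


if __name__ == "__main__":
    o365_tenant = tenant_id_for_domain("contoso.onmicrosoft.com")  # placeholder
    subscription_tenant = "..."  # paste the GUID from: az account show --query tenantId
    if o365_tenant == subscription_tenant:
        print("Same tenant - scheduled refresh should authenticate.")
    else:
        print("Stray duck: Office 365 and the subscription are in different tenants.")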
Coming back to the title of this blog post:
You cannot currently authenticate against an Azure Data Lake Store from PowerBI.com across tenants.
What To Do
Once you've finished cursing, considering everything you've developed over the last 6 months in your Azure Subscription, take a breath. Unfortunately, the only long-term thing you can do is set up a new Azure Subscription, make damn sure that it's linked to your Office 365 organisation and thus resides in the same tenant, and then migrate your Data Lake Store to the new subscription.
Once these ducks are in line the credentials you supply to PowerBI.com for the dataset refresh will be accepted. I promise. I’ve done it.
A short-term workaround is to refresh your datasets in the desktop app every day and republish new versions. Very manual. Sorry to be the bearer of bad news.
Well my friends, I recommend that we strongly petition Microsoft to lift this restriction. I say restriction because it seems like madness. After all, the PowerBI.com connector to Azure Data Lake is using OAuth2, so what's the problem? Furthermore, back in Power BI Desktop land we can connect to any store with any credentials. We can even create a Power BI workbook joining two Data Lake Stores with two different sets of credentials (handy if you have a partial data model in production and new outputs in a test environment).
Here is my attempt to get things changed and I’d appreciate your votes.
To conclude, I really want to update this blog post soon with some better news given the above. But for now, I hope it has helped you understand the problem you're facing, or raised your awareness of a future problem you're likely to encounter.
Many thanks for reading.