
Analysis Services


Azure Storage Backup Retention

This blog is a follow-up to a previous blog I wrote about backing up Azure Analysis Services cubes in Azure; that blog can be found here. This blog shows how to implement a retention policy using PowerShell in Azure Runbooks to remove the backups after a set number of days. To create a new Runbook in the Azure portal, go to the relevant Automation account in the relevant resource group and then select Runbooks from the left-hand pane. Note that you will need to add the Az.Storage module to the Automation account to be able to use some of the commands listed in this blog.

The parameters required here are shown below, where ResourceGroupName is the name of the resource group that has the relevant storage account, StorageAccountName is the name of the storage account resource and ContainerName is the name of the container holding the backup files to be removed. BlobName is part of the name of the files and will be used later to obtain only the relevant files from the storage container. RetentionDays specifies the number of days to keep a backup before it is removed.

The Set-AzCurrentStorageAccount command sets the storage account to be used by the subsequent commands, which get the container using the ContainerName parameter and then list the relevant files in the blob container using the BlobName parameter.

Once the list of backup files has been obtained, each file can be checked in turn and removed if it meets the relevant criteria. For each file obtained, the LastModified property is used to identify when the file was created. If the resulting date time is older than the number of retention days from today, the Remove-AzStorageBlob command is used to remove that file from the storage container. This check is repeated for all files, removing the relevant files, and leaving only those that are not older than the RetentionDays parameter.
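Putting those steps together, a runbook along the lines of the sketch below would do the job. This is only an illustration and not the original runbook: it assumes you have already authenticated to Azure within the runbook (as covered in the previous blog), and the wildcard filter on the blob name is my assumption of how BlobName is used.

param (
    [Parameter(Mandatory = $true)] [string] $ResourceGroupName,
    [Parameter(Mandatory = $true)] [string] $StorageAccountName,
    [Parameter(Mandatory = $true)] [string] $ContainerName,
    [Parameter(Mandatory = $true)] [string] $BlobName,
    [Parameter(Mandatory = $true)] [int] $RetentionDays
)

# Point the subsequent Az.Storage commands at the relevant storage account
Set-AzCurrentStorageAccount -ResourceGroupName $ResourceGroupName -Name $StorageAccountName

# List only the backup files whose names contain the BlobName value
$blobs = Get-AzStorageBlob -Container $ContainerName -Blob "*$BlobName*"

# Remove any backup file older than the retention period
$cutoff = (Get-Date).AddDays(-$RetentionDays)
foreach ($blob in $blobs) {
    if ($blob.LastModified.DateTime -lt $cutoff) {
        Remove-AzStorageBlob -Container $ContainerName -Blob $blob.Name
    }
}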

This is all you need to start cleaning up the backup files in Azure, and although I have used Azure Analysis Services backups for my example here, this code can be used for any blob files in Azure.

Variables in DAX

Variables can simplify your DAX code, help with debugging and help improve performance. To use variables in your DAX measure you need to declare a variable using the VAR keyword, give the variable a name, and then assign an expression to the variable. You can use multiple variables in a measure but when using variables you must use a RETURN statement to specify the final output to be returned.

Variables syntax
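The general pattern looks like this (a generic sketch rather than a specific measure):

Measure Name =
VAR FirstVariable = <expression>
VAR SecondVariable = <expression that can reference FirstVariable>
RETURN
    <final expression using the variables>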

To show how to make use of variables I’ll go through a simple measure and rewrite it using variables; I’ll be using the Adventure Works Internet Sales tabular cube for this.

A measure called Total Sales YoY Growth % (shown below) has been created which calculates sales for the current year and the previous year, then works out the percentage increase/decrease from the previous year to the current year.

The example measure to be rewritten
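The measure itself is shown as an image in the original post; based on the description, a reconstruction (my formatting, using the Adventure Works names) would look something like:

Total Sales YoY Growth % =
DIVIDE (
    SUM ( 'Internet Sales'[Sales Amount] )
        - CALCULATE (
            SUM ( 'Internet Sales'[Sales Amount] ),
            PARALLELPERIOD ( 'Date'[Date], -12, MONTH )
          ),
    CALCULATE (
        SUM ( 'Internet Sales'[Sales Amount] ),
        PARALLELPERIOD ( 'Date'[Date], -12, MONTH )
    )
)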

You can see there is repeated logic for the CALCULATE block, which means it is evaluated twice; notice that SUM('Internet Sales'[Sales Amount]) is referenced three times. Using variables for the repeated logic gives the following code:

The original measure rewritten using variables
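Reconstructed in the same way, the variable version would be along these lines:

Total Sales YoY Growth % =
VAR TotalSales = SUM ( 'Internet Sales'[Sales Amount] )
VAR TotalSalesPP =
    CALCULATE (
        SUM ( 'Internet Sales'[Sales Amount] ),
        PARALLELPERIOD ( 'Date'[Date], -12, MONTH )
    )
RETURN
    DIVIDE ( TotalSales - TotalSalesPP, TotalSalesPP )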

In this version the result of SUM('Internet Sales'[Sales Amount]) has been assigned to the variable TotalSales, and the result of the CALCULATE section has been assigned to the variable TotalSalesPP. The CALCULATE section is now evaluated only once and assigned to a variable. This is a basic example so the performance gain is insignificant, but the same approach will help the performance of more complex measures.

To simplify the code further a new variable called TotalSalesVariance is created, and a final variable called Result is created to store the final calculation to be returned. The final code then becomes:

The final version with additional variables
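Again as a reconstruction rather than the exact screenshot, the final version would look something like:

Total Sales YoY Growth % =
VAR TotalSales = SUM ( 'Internet Sales'[Sales Amount] )
VAR TotalSalesPP =
    CALCULATE (
        SUM ( 'Internet Sales'[Sales Amount] ),
        PARALLELPERIOD ( 'Date'[Date], -12, MONTH )
    )
VAR TotalSalesVariance = TotalSales - TotalSalesPP
VAR Result = DIVIDE ( TotalSalesVariance, TotalSalesPP )
RETURN
    Result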

This version then allows you to debug the measure by using any of the variables in the RETURN statement, making it possible to verify the values for each of the variables:

Variations of the same measure returning different variables for debugging

The one thing to look out for with variables is filter context: once a variable has been declared and a value assigned to it, you cannot use a CALCULATE statement to override the filter context and get a new value from the variable.

For example, TotalSalesPP = CALCULATE(TotalSales, PARALLELPERIOD('Date'[Date], -12, MONTH)) would return the same value as the variable TotalSales; hence TotalSalesPP = CALCULATE(SUM('Internet Sales'[Sales Amount]), PARALLELPERIOD('Date'[Date], -12, MONTH)) is the correct way to write this.

Hopefully this blog will help you introduce variables into your DAX code!

Azure Analysis Services Cube Backup

This blog is a quick guide on how to back up an Azure Analysis Services cube using PowerShell. The PowerShell runs in a runbook inside an Automation account in Azure to automate the backup process. A prerequisite for this is to configure the backup storage settings, which specify the location of the backups.

Two parameters are passed into the runbook (shown below), where AnalysisServerName is the server name for the analysis services cube and AutomationCredentialName is the name of the credential for the Automation account in Azure.

The following variable is used to store the current date, obtained using the Get-Date cmdlet. This will be used as part of the backup name.

The command Get-AzureAutomationCredential is used to get the credentials for the Automation account; in this case the credential name is one of the parameters provided to the runbook. The result is saved to a variable that will be used as a parameter for the backup command.

And finally, the backup command completes the backup of the analysis services cube to a file. Several parameters are used in this command: -BackupFile is the name given to the backup file, -Name is the name of the cube being backed up, -Server is the server name (passed in as a parameter), -Credential is the credentials obtained in the previous step, and -ApplyCompression enables the backup with compression.
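As a rough sketch of how these pieces fit together (not the exact original runbook: I've used the Get-AutomationPSCredential activity commonly used inside runbooks, assumed the backup cmdlet is Backup-ASDatabase from the SqlServer module, and used a placeholder database name):

param (
    [Parameter(Mandatory = $true)] [string] $AnalysisServerName,
    [Parameter(Mandatory = $true)] [string] $AutomationCredentialName
)

# Current date, used as part of the backup file name
$DateStamp = Get-Date -Format "yyyyMMdd"

# Credential asset stored against the Automation account
$Credential = Get-AutomationPSCredential -Name $AutomationCredentialName

# Back up the model to a compressed .abf file in the configured backup storage
Backup-ASDatabase -BackupFile "MyCubeDB_$DateStamp.abf" `
                  -Name "MyCubeDB" `
                  -Server $AnalysisServerName `
                  -Credential $Credential `
                  -ApplyCompression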

I hope you have found this useful! The follow-up to this blog about cleaning up the backup files can be found here.

Tabular Cube Processing Report

I have created a Power BI report which provides detail on the state of processing in a Tabular Cube.
The report uses the cube’s dynamic management views to provide information about the cube’s partitions.
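For context, partition details like these come from the tabular metadata DMVs; a query along the following lines (a sketch, the report's exact queries and columns may differ) returns each partition's state and refresh time:

-- Partition metadata for a tabular database, including last refresh time
SELECT [Name], [TableID], [State], [RefreshedTime], [ModifiedTime]
FROM $SYSTEM.TMSCHEMA_PARTITIONS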

To use the tabular cube processing report, you need to insert the following information:

  • Server
  • Database (Cube Name)

Once entered, and assuming the connection is fine, you need to accept some native queries. These are SELECT statements and will not alter the cube. That should give you a report similar to the one below; I have used an Adventure Works tabular model as an example.

Click here to download the report

This report is inspired by one I found for general tabular cube documentation by Data Savvy, which I highly recommend. That report includes information on tables, columns, relationships, measures and security:

Click here to view tabular cube documentation blog post

My experience in obtaining an MCSE

Background

I have recently completed my MCSE in Data Management and Analytics, and I wanted to share my experience of working towards and passing the exams that have led to me getting this MCSE. In this post I will cover some of the challenges I faced and provide some tips and advice; hopefully others following the same certification path, or other paths, will find this useful.

I am a business intelligence developer at Purple Frog with several years of experience working with data. I started off as a data analyst, then moved into reporting, and more recently have been working on ETL, data warehousing and cubes. I have been working with SQL throughout my various roles, and therefore the best place for me to start my certification path was with exam 70-761: Querying Data with Transact-SQL. The path I’ve taken up to this point is:

MCSA: SQL 2016 Database Development
– 70-761: Querying Data with Transact-SQL
– 70-762: Developing SQL Databases

MCSE: Data Management & Analytics
– 70-767: Implementing a SQL Data Warehouse (from the Elective Exam Pool)

The learning material

Although the first exam (70-761) fitted in quite well with my SQL background (I probably knew about 75% of the material beforehand), there was still some work required to get to the stage where I felt confident of passing the exam. For me, the best and primary resource for learning the material has been the Exam Ref books, for example “Exam Ref 70-761 Querying Data with Transact-SQL”. These books are structured so that the content is split into the sections covered by the exam; there is a chapter for every skill measured in the exam.

The one downside to the Exam Ref books is that at times they can feel quite wordy if you’re relying on the book alone, so what I found really useful was to supplement them with videos and demos on the topics where I needed a greater understanding. In addition, practice exercises helped me to further understand the different concepts, as I was able to try what I had learnt and see where I was going wrong.

The final resource that I found useful was Microsoft Docs (https://docs.microsoft.com/en-us/); this is a really good reference point for key facts. For example, I found the page on CDC really useful for my latest exam (70-767).

The exam

There are the obvious tips, such as getting an early night before the exam and arriving at the exam centre with time to spare, but I wanted to share some of the exam techniques I found useful while sitting these exams.

My top tip is to check that you have covered and are comfortable with all the skills measured in the exam; the skills measured for each exam can be found in the “Browse Certifications and Exams” section on the Microsoft Learning website (example shown below for exam 70-761). The skills are also stated in the Exam Ref books and, as mentioned before, the chapters in the books are based on the skills measured in the exam.

This was taken from https://docs.microsoft.com/en-us/learn/certifications/exams/70-761

What’s useful about the skills measured shown above is that they include the weighting of questions per skill in the exam, so you can work out whether you need to focus on a weaker area that makes up a big part of the exam.

Time shouldn’t be an issue in the exam if you’ve prepared well; however, some questions are not worded in the best way and can catch you out, so do take the time to read each question properly, and keep an eye on the time remaining every 5-10 questions.

You have the option to flag questions and review them again later (note that some questions cannot be flagged); make use of these flags for questions you are unsure of. This can be particularly useful if you’ve flagged a question and then a later question gives you a clue or reminds you of the answer for the question flagged earlier. Alternatively, you should be provided with a pen and wipeable board where you can make notes, so note down the question number and topic so that you can come back to it later.

Next steps

I am currently studying towards exam 70-768: Developing SQL Data Models. This will help develop my understanding and knowledge of data modelling and working with cubes, and will also give me the certification for MCSA: SQL 2016 BI Development. With these current certifications being retired in the near future, the next plan is to work towards the Microsoft Certified: Azure Data Engineer Associate certification.

I hope you have found this blog useful and that you can use some of the tips mentioned in your study plans. All the best!

SSAS Tabular Calculation Groups – avoid using SELECTEDMEASURE or ISSELECTEDMEASURE

Introduction:

There is a very serious limitation in the behaviour of calculation groups when using the SELECTEDMEASURE or ISSELECTEDMEASURE functions, and we recommend not using them. Why? If a user creates their own custom calculations within their Power BI report (or composite model) then the value of SELECTEDMEASURE changes, breaking your calculation group logic.

Let me explain with an example:

In a tabular cube we may look to use calculation groups to apply additional filters when calculating measures. In our example we will be using a retail scenario, with budgets as well as ‘Like for Like’ (LFL) budgets. For those not in retail, LFL represents stores that were also open this time last year. We start with the measure “Budget” with the following definition:
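The definitions in the original post are screenshots, so what follows is a hypothetical reconstruction with placeholder table and column names. The “Budget” measure might simply be:

Budget = SUM ( 'Budget'[Budget Amount] )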

We also have a hidden measure called “Budget LFL” with the following definition:
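A placeholder reconstruction would be a filtered variation of the same expression, for example:

Budget LFL =
CALCULATE (
    SUM ( 'Budget'[Budget Amount] ),
    'Store'[Is LFL Store] = TRUE ()
)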

The measure “Budget LFL” is a variation of the “Budget” measure with additional filters applied at source. We also have a calculation group called “LFL Filter”; this is used to return the hidden “Budget LFL” measure based on the option selected in this filter, and has the following definition:
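A calculation item built on SELECTEDMEASURE/ISSELECTEDMEASURE for this would look something like the following sketch (again reconstructed, not the author's exact code):

-- Calculation item "LFL" on the 'LFL Filter' calculation group
IF (
    ISSELECTEDMEASURE ( [Budget] ),
    [Budget LFL],        -- swap in the hidden LFL variant of Budget
    SELECTEDMEASURE ()   -- leave every other measure untouched
)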

This functionality is demonstrated below where the two visuals from Power BI show the same measure with and without the LFL Filter applied:

No LFL filter applied
LFL filter applied

Problem:

A problem arises when you try to create a custom measure (in the cube or within a Power BI report) and filter by the calculation group. For example, we create a measure called “CustomBudget”, which is a copy of the “Budget” measure and has the following definition:
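Using the same placeholder names as before, this is just a copy of the Budget expression:

CustomBudget = SUM ( 'Budget'[Budget Amount] )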

Adding this custom measure to the visual shown earlier, we can see that the calculation group “LFL Filter” has no effect on the custom measure:

LFL filter applied

This is because SELECTEDMEASURE() is now [CustomBudget] and not [Budget], so the logic in the calculation group doesn’t recognise the selected measure and therefore doesn’t switch to the LFL measure.

Workaround:

To get around this we move the bulk of the logic from the calculation group to the measure itself. The measure now changes its behaviour by looking at the selected value of the LFL filter, instead of the calculation group managing it centrally:

This is the new definition for the “Budget” measure
This is the new definition for the “LFL Filter” calculation group
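Sketched with the same placeholder table, column and calculation item names as above, the new definitions might look something like this: the measure inspects the calculation group selection itself, and the calculation item becomes a simple pass-through.

-- New "Budget" measure: checks which 'LFL Filter' item is selected
Budget =
IF (
    SELECTEDVALUE ( 'LFL Filter'[LFL Filter] ) = "LFL",
    [Budget LFL],
    SUM ( 'Budget'[Budget Amount] )
)

-- Calculation item on the 'LFL Filter' calculation group
SELECTEDMEASURE ()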

We refresh Power BI and find that the results are now as expected; the original and custom measures now match when using the calculation group as a filter:

This is the same visual but using the new logic, now with matching results

Thank you to Darren Gosbell (Twitter | Blog) for the suggestion of this workaround.

Cube deployment error: Cannot resolve all paths while de-serializing Database

I recently came across the following error while deploying a tabular cube using the Analysis Services Deployment Wizard:

My updates for this deployment included removing a table from the cube. As part of the cube deployment options I chose to “Retain roles and members”. However, the issue here was that an existing role was referencing the table I had deleted; for example, the image below shows the Role Manager in Visual Studio, where the Cube_Finance role references the “Transaction Type” table.

To resolve this error I simply had to deploy the cube using the option “Deploy roles and retain members”:

This resulted in a successful deployment with the role updated to exclude any references to the deleted table:

Note that this error can also occur if a table has been renamed, for the same reason explained above: a role references the “old” table name. The solution is the same: deploy roles and retain members.

SSAS Tabular Deployment Wizard fails Newtonsoft.Json error

The Scenario

Deploying an Analysis Services Tabular model to SSAS in Azure using the Analysis Services Deployment Wizard, with both Visual Studio 2017 and SQL Server 2017 installed on the client.

Trying to click on the ellipses to change the data source connection string or impersonation information results in a Newtonsoft.Json error:

“Could not load file or assembly ‘Newtonsoft.Json, Version 6.0.0.0, Culture=neutral, ……”


The Solution

If you look at the folder containing Newtonsoft.Json.dll (“C:\Program Files (x86)\Microsoft SQL Server\140\Tools\Binn\ManagementStudio\Extensions\Application” – substitute 140 for your version of SQL Server) and right-click the Newtonsoft.Json.dll file and select Properties, you’ll probably find that it’s version 10.0.3 or later, but the error message shows that the deployment wizard is looking for version 6.0.x. We therefore have to tell the SSAS Deployment Wizard to use the newer version instead.

Use Notepad(++) to open the Microsoft.AnalysisServices.Deployment.exe.config file, which you should find in “C:\Program Files (x86)\Microsoft SQL Server\140\Tools\Binn\ManagementStudio” (substitute 140 as above, for your correct version).

Find the section for Newtonsoft.Json

And make a couple of changes (back up the file first!), as illustrated in the sketch after this list:

  • Add in a bindingRedirect line, redirecting 6.0.0.0 to 10.0.0.0
  • Change the codeBase version number to match the actual file version, 10.0.0.0
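The resulting section should look roughly like the sketch below. Treat this as an illustration rather than a copy-paste replacement: the publicKeyToken shown is the standard one for Newtonsoft.Json, and the codeBase href should stay as whatever path your file already uses.

<dependentAssembly>
  <assemblyIdentity name="Newtonsoft.Json" publicKeyToken="30ad4fe6b2a6aeed" culture="neutral" />
  <!-- Redirect requests for the old 6.0.0.0 assembly to the installed 10.0.0.0 version -->
  <bindingRedirect oldVersion="6.0.0.0" newVersion="10.0.0.0" />
  <!-- The codeBase version must match the actual file version -->
  <codeBase version="10.0.0.0" href="<existing path to Newtonsoft.Json.dll>" />
</dependentAssembly>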

Save and close the file, relaunch the deployment wizard, and you should find it all works ok.

Frog-Blog Out


Showing December to December to see opening values for a year in SSAS/MDX

I came across an interesting MDX challenge this week: within a cube’s Date dimension, how to show December twice, once where it should be and again as the opening value for the following year, i.e. for each year I need to show Dec (prev yr), Jan, …, Nov, Dec.

Why? Well if you consider the following pivot chart, you can clearly see growth from Jan to Feb, Feb to Mar, etc., but it’s very difficult to see the growth between Dec and Jan.

So to make things easier for a user, we want to repeat the Dec value and insert it before Jan as an opening value for the year. Now we can’t duplicate December’s data in the DW or cube for obvious reasons, so we need to create a virtual copy of December using some MDX.

Step 1 – Create a new virtual month member called ‘Open’

CREATE MEMBER CURRENTCUBE.[Date].[Month Name].[Open] AS null;

Step 2 – Create a named set ‘Dec to Dec’ that includes Open as well as Jan-Dec

CREATE STATIC SET CURRENTCUBE.[Dec to Dec]
AS {[Date].[Month Name].[Open]
,[Date].[Month Name].[Month Name].[Jan]
:[Date].[Month Name].[Month Name].[Dec]};

Step 3 – Set the ‘Open’ member to show the value for ‘Dec’ in the previous year

SCOPE ([Date].[Month Name].[Open]);
THIS = ([Date].[Year].PREVMEMBER, [Date].[Month Name].[Dec]);
END SCOPE;


When you then select the ‘Dec to Dec’ named set in Excel you do indeed see 13 months per year, including our new ‘Open’ month. There are two problems, however:

  1. Open appears at the end, not the start. To fix this, open up the ‘Field Settings’ of the ‘Dec to Dec’ field in the pivot table, select the ‘Layout & Print’ tab and untick ‘Automatically order and remove duplicates from the set’.
  2. If you filter the pivot to only show a single year, Excel decides to hide the ‘Open’ member (this is Excel’s doing – the MDX works fine!). To fix this, go to ‘PivotTable Options’, then on the ‘Totals & Filters’ tab untick ‘Include filtered items in set totals’.

Hey presto, your pivot table will now look something like this:

You can now clearly see the growth from Dec to Jan, and everyone’s happy.

<Frog-Blog Out>

MDX Compare or Rank Similar Members

Or should this be called: “Recreating DAX’s EARLIER function in MDX” – either way, a useful technique that solves a problem…

MDX makes it very easy for us to compare one member against others, using functions such as RANK() etc. But how do we dynamically compare a member against a subset of other members? I came across a customer requirement recently where we had to rank a member against other members that had similar properties, based on other measures. This will probably make more sense with an example…

Taking AdventureWorks, we can easily calculate the sales rank of each product using something like this:

WITH MEMBER [Measures].[Rank] AS
   RANK([Product].[Product].CURRENTMEMBER
       ,[Product].[Product].[Product].MEMBERS
       ,[Measures].[Internet Sales Amount])
SELECT {[Measures].[Internet Sales Amount], [Measures].[Rank]} ON 0
      ,NONEMPTY({[Product].[Product].[Product].MEMBERS}
                ,[Measures].[Internet Sales Amount]) ON 1
FROM [Adventure Works]
WHERE [Date].[Calendar Year].&[2008]


This ranks each product against all other products. But is that a fair comparison? Comparing sales of a water bottle against a top-of-the-range racing bike? Not really. So how about we instead rank each product against all products within +/-20% of the same cost. So if the water bottle costs £20, then we would rank its sales against all products with a cost between £16 and £24. This gives a more accurate idea of how well each product is performing compared to its peers.

Although to keep the MDX simple here, let’s just say any product within £20.

In AdventureWorks we have [Measures].[Internet Average Unit Price], which can be used to determine comparable products. So how do we go about achieving this?

If we look at the RANK function, it takes three parameters; the member being ranked, the set over which to rank, and the measure to be used.

All we have to do is filter the second parameter, the set over which to rank, to include similar members. So maybe something like this:

WITH MEMBER [Measures].[Rank] AS
   RANK([Product].[Product].CURRENTMEMBER
       ,FILTER([Product].[Product].[Product].MEMBERS
              ,ABS([Measures].[Internet Average Unit Price]
               -([Measures].[Internet Average Unit Price]
                ,[Product].[Product].CURRENTMEMBER))
               <=20
               )
       ,[Measures].[Internet Sales Amount])

If we break this down, we’re just changing the 2nd parameter to be a filtered set, where the unit price is within £20 of the unit price of the current member. This should work, right?

Unfortunately, wrong. The results look exactly the same as the original rank – nothing has changed.


The problem here is that CURRENTMEMBER is within the filter function, so it changes context to refer to whatever row is being considered at the time by the filter function. So [Measures].[Internet Average Unit Price] and ([Measures].[Internet Average Unit Price],[Product].[Product].CURRENTMEMBER) are always the same product, and no rows are filtered out. CURRENTMEMBER does NOT refer to the current member being considered by the RANK function, but by the FILTER function.

In DAX we have the EARLIER and EARLIEST functions, which would be great here, and would allow us to step out of the current context into the previous calculation layer. But unfortunately we haven’t been blessed with an MDX EARLIER function. So how do we fix this in MDX?

The trick here is dynamic sets, using the STRTOSET function. This allows us to grab the member being ranked, and treat it as a fixed member within the FILTER function.

WITH MEMBER [Measures].[Rank] AS
   RANK([Product].[Product].CURRENTMEMBER
       ,STRTOSET('
          FILTER([Product].[Product].[Product].MEMBERS
                ,ABS([Measures].[Internet Average Unit Price]
                 -' + CSTR([Measures].[Internet Average Unit Price]) + ')
                 <=20
                )'
       )
       ,[Measures].[Internet Sales Amount])

We’re building up a string, which fixes the value of [Measures].[Internet Average Unit Price] to that of the product being ranked, and then dynamically compares it to the value of [Measures].[Internet Average Unit Price] for all other products. Those within £20 will be included in the resulting set, and will be used to rank the original product.

You can see the result in the screenshot above, where the ranking is dependent on the average unit price.

Frog-Blog Out
