0845 643 64 63

Dashboarding and Reporting

Creating calendar tables with DAX using Power BI

By Nick Edwards

In this blog post we’ll take a quick look at creating a self-generating calendar table using DAX.

Dates are important if we want to perform time intelligence reporting on our data i.e. yearly sales, monthly sales, weekly sales, year to date sales or previous year sales.

We’ll be using the calendar function to create our date table, but there are other methods to do this such as CALENDARAUTO or GENERATESERIES.

Here is the syntax we’ll be using to generate our date table.

CALENDAR(<start_date>,<end_date>)

The calendar function returns a single column called “Date” which generates a continuous series of dates from the specified start date to the specified end date. So if we specified the start date to be the 01/01/2020 and the end date to be 31/12/2020 the function would generate 366 rows of distinct date data.

We can then use the add columns function to expand our calendar table further to add specific columns we wish to slice our data by i.e. year, month, quarter, week number, day…

Here is the syntax we will be using to expand our calender table.

ADDCOLUMNS(<table>,<name>,<expression>,[<name>,<expression>]…)

Let’s use an example to further explore how these functions work in practice using some sample adventure works sales data. Here we use the “Get Data” icon to directly query our adventure works database and bring over a sample of sales data, named “AW_Sales”.

Now, to create our calendar table we need to click “New Table” in the modeling tab and enter the following function.

Note: I like to use variables in my date table just to keep the DAX looking clean and fuss free, but this isn’t necessary.

For the start date parameter we have used the FIRSTDATE() and for the end Date parameter we have used the LASTDATE() function. This is so we can extract the first and last “OrderDate” from our AW_Sales table. We could have also used the MIN() and MAX() function to deliver the same results using the newly created “Date” column. As we can see this has generated a sequential date list from the 01/01/2012 to the 31/12/2013 with 731 distinct rows of data.

Now we want to expand our calendar table using the newly created “Date” column with new columns which can slice and dice our data. To do this I use the ADDCOLUMNS function. Here I have added Year, Quarter, Year Month, Month Number, Month Name, Day Of Year, Day Of Month, Day Of Week and Day Name as columns.

Once we’ve done this, we will mark our newly created table as a date table to allow Power BI to recognize date hierarchies and time intelligence functions.

We can now view our newly generated calender table in the data view. Instantly when any new data enters our model from “AW_Sales”, the calendar table will expand accordingly due to the last date function used above.

Now it’s just a case of creating a one to many relationship between our new calendar table and our “AW_Sales” table. We’ll create a one to many relationship between “Calendar[Date]” and “AW_Sales[OrderDate]” as shown below.

Congratulations we have now created a fully fledged calendar table that can slice and dice our “AW_Sales” tale by any of the columns we have created in our calendar table, as shown in the example below.

First thoughts on DataZen

Following the recent announcement of Microsoft acquiring DataZen, I’ve been having a play around to see what it can and can’t do. Here’s a very brief summary so far:

The Good:

  • Very quick and easy design interface
  • Fantastic way of modifying dashboard layout for tablets and mobile devices
  • It seems to ‘just work’ very well
  • You don’t have to worry about the pixel perfect layout of what’s where, it takes care of it for you

Screen shots of the same dashboard running on an iPhone 6 and in a web browser:

datazen01
datazen02

The Bad:

  • The designer is Windows 8 only. Really?!
  • Yes it technically connects to SSAS cubes, but the interface is quite frankly no more than smoke and mirrors.  It connects in the same way as QlikView does; it seems that you have to write an MDX query to return an entire fact table at whatever level of granularity you want to be able to filter by. DataZen then pulls back all of the data and then re-aggregates this for your dashboard. This may be ok for a small cube, but with a large measure group that needs to be sliced/diced by a number of different attributes at the same time this quickly becomes a bad idea. It is not able to use the power of SSAS, it treats it as simply another flat data source.
  • Scalability seems to be limited. For example you can provide a flat dataset containing a key and a parent key, and DataZen will turn it into a tree filter for you. Except that I tried this with a 7000 node parent child hierarchy in a tree and it just couldn’t cope. It did eventually load after numerous attempts, but it was so unresponsive that it was unusable. It seemed to work ok with a few hundred nodes.
  • You don’t have to worry about the pixel perfect layout of what’s where, it takes care of it for you. Yes this is a good point, but it also unfortunately means that there is little scope for customisation. This gets frustrating when it doesn’t ‘just work’. For example resizing a chart with a legend, in some sizes on the mobile view the legend was so big it left no space for the chart. It would be nice to be able to turn off the legend for the mobile view or something similar.
  • SSAS calculation logic is not supported. Well, this is an extension of the smoke and mirrors SSAS implementation, but it’s particularly relevant in this point. One of our existing clients is a heavy dashboard user, with data sourced from SSAS multidimensional cubes. They have a number of KPIs defined, for which the target is the actual value of the previous period. Now SSAS takes care of this beautifully. If the user selects a month then the target is the previous month’s value. If they select a week then the target is the previous week’s value, etc. However DataZen provides SSAS with no context over what is selected, and so SSAS is not able to dynamically do its magic. Therefore KPI targets cannot be dynamic, they need to be static and fixed at the point of DataZen data refresh.
  • Dates from MDX queries don’t seem to want to hook into the Time Navigator filters, and there’s no way of forcing them to. Although I’m guessing that this is a problem with how I’m doing it, so probably not fair to include it here.

Summary

I’ve not been playing with the tool for very long, so I may find ways around all of this. However at the moment I’m concerned that Microsoft are placing their dashboard/analytics future on a product that doesn’t properly support SSAS. In my mind the core strengths of the MS Business Intelligence offering is underpinned by the power of cubes, backed up with SSIS and a the strength of the SQL Server database platform. To buy into a dashboarding platform that doesn’t support and build on this is a cause for concern.

My hope/expectation is that Microsoft take elements from PowerView (that does properly support realtime queries against a cube) and elements from ProClarity/PerformancePoint (e.g. the decomposition tree, etc.), and embed them into DataZen. In which case they could end up with an awesome product.

Yes I’m a cube guy, so I’m naturally focused on the poor SSAS integration. If you don’t use cubes then you’ll probably love it. But MS have some work to do to bring me around.

Anyway, time will tell what happens next… Lets keep fingers crossed

Frog-Blog-Out

Excel creates inefficient MDX

Whilst investigating a slow set of Excel pivot tables (connected to a multidimensional SSAS cube), I found some odd behaviour in how Excel generates its MDX, sometimes doing far more work than if required and causing significant performance reduction.

Take the following example, just querying Customer Count by Promotion, against the AdventureWorks cube.

ExcelMDX01

The profile trace (just using “Query Begin”, “Query End” and “Query Subcube Verbose”) shows that the query runs two “Query Subcube Verbose” calls, which means two separate calls to the SSAS storage engine.

The first Query Subcube event requests “0” for Promotion, which is the [All] member, and so is used as the total in the pivot table. The second queries “*”, which returns every member in the Promotion attribute, which in this example returns values for each of the 4 different promotions.

This makes sense, as for a distinct count measure, you can’t simply add up the components, you do need to calculate the subtotal or grand total separately. The query just takes twice as long as a result.

So lets say that the total is irrelevant to the user, so they disable totals within Excel…

Then we refresh the pivot, we should only see a single storage engine call. Right? Wrong.

ExcelMDX04There are still two calls to the storage engine, and SSAS is still calculating the grand total for the query, which Excel is then just discarding.

In a small cube this may make very little noticeable difference to the pivot performance. But on a multi-terrabyte cube with billions of records, distinct count operations can take minutes to run. And calculating the non-required totals can make a significant difference.

This becomes even worse when multiple attributes are stacked on the rows and/or columns. Each new attribute requires its own totals calculating, resulting in more and more calls to the storage engine. In the following pivot I’ve added an extra two attributes to the rows, with no subtotals or grand totals.

ExcelMDX05The following trace shows that 8 Subcube queries were performed.

ExcelMDX06These correspond to detail and totals for each combination of attributes, which calculates as 2^n storage queries, where n is the number of attributes selected on the rows/columns of a pivot.

Now it’s common practice for users to want to create pivots with 4, 5, 6+ attributes on rows/columns, and the workload on the cube increases exponentially with each one they add. This is not good!

There are two workarounds to this:

1) Forcibly disable totals for relevant dimensions in the cube.

2) Use Excel 2013 (or above), and hard code the set of rows to exclude totals.

 

Let’s look at these in more detail…

1) Forcibly disable totals for relevant dimensions in the cube.

Let’s say that for a particular cube, customer counts were only ever calculated on the monthly level, and it didn’t make sense to aggregate customer counts up to a total across all months/years. We could just add some MDX into the Calculation script to disable date totals for the relevant measure(s).

SCOPE([Measures].[Customer Count], [Date].[Calendar].[All]);
      THIS=null;
END SCOPE;

Excel will still produce an MDX query asking for the date totals, but SSAS will just ignore it and return null. Note that this obviously only works where there is no business sense in asking for the total.

 

2) Use Excel 2013 (or above), and hard code the set of rows to exclude totals.

Excel 2013 introduced an ability to customise the MDX of a pivot. On the Analyze ribbon, click ‘Fields, Items & Sets’, and then ‘Create Set Based on Row Items…’ExcelMDX07You can then remove any totals or rows that you don’t want, including the subtotals and grand total.

ExcelMDX08Or if you’re an MDX guru, you can click on ‘Edit MDX’ and write your own sets for even more performance, and also to make it more dynamic.

When we run this, we get the same output in Excel, but the profile trace now shows:

ExcelMDX09Just a single call to the storage engine, instead of 16. 1/16 of the work for the same results.

Now beware that once you do this, you’re essentially hard coding that part of the pivot, so in this example any new promotions would not automatically show in the pivot unless they were added. That;s why learning some MDX and writing a more dynamic set is preferable.

But the obvious answer to this is, please, please Microsoft, fix the terrible MDX that Excel creates! If you agree, please upvote the Connect item requesting that it be improved.

Frog-Blog Out

 

Microsoft Acquires Datazen

Exciting news in the Microsoft Analytics space. On 14th April 2015 Microsoft announced they are acquiring Datazen – what does this mean, and is it a good or a bad thing?

For all of PowerView’s good points, it’s still a very immature product. I’ve still not had a single customer take it seriously, aside from a few proof of concept implementations. It’s a good ‘personal analytics’ tool, but in my opinion it doesn’t yet hit the mark when it comes to an enterprise grade analytics front end.

PerformancePoint is still the only MS tool that offers a traditional dashboarding interface, and quite frankly it’s well past its best before date. With almost zero development in recent years it has been left behind as a clunky relic of days gone by. Although one that is still in surprisingly widespread use through no reason other than lack of alternative MS options.

With great 3rd party tools like Pyramid Analytics and Tableau out there, Microsoft have really dropped the ball on this. We’ve all been waiting for PowerView to mature and grow, but now Microsoft has announced that they’re acquiring Datazen, which could be a game changer.

I’ve not used the tool before, but I’ll be downloading it asap and will report back with findings in due course. It ticks a number of boxes; multiple data sources including OLAP, SQL, etc, mobile friendly (apparently), interactivity, KPIs, maps, etc.

The key question for me is how will Microsoft approach SSAS cube development with Datazen. Will they maintain and develop full support for multidimensional cubes, or will they prioritise tabular cube functionality. Time will tell.

Power BI Sentinel
The Frog Blog

Team Purple Frog specialise in designing and implementing Microsoft Data Analytics solutions, including Data Warehouses, Cubes, SQL Server, SSIS, ADF, SSAS, Power BI, MDX, DAX, Machine Learning and more.

This is a collection of thoughts, ramblings and ideas that we think would be useful to share.

Authors:

Alex Whittles
(MVP)
Reiss McSporran
Jeet Kainth
Jon Fletcher
Nick Edwards
Liam McGrath

Data Platform MVP

Power BI Sentinel
Frog Blog Out
twitter
rssicon