Whilst investigating a slow set of Excel pivot tables (connected to a multidimensional SSAS cube), I found some odd behaviour in how Excel generates its MDX, sometimes doing far more work than if required and causing significant performance reduction.
Take the following example, just querying Customer Count by Promotion, against the AdventureWorks cube.
The profile trace (just using “Query Begin”, “Query End” and “Query Subcube Verbose”) shows that the query runs two “Query Subcube Verbose” calls, which means two separate calls to the SSAS storage engine.
The first Query Subcube event requests “0” for Promotion, which is the [All] member, and so is used as the total in the pivot table. The second queries “*”, which returns every member in the Promotion attribute, which in this example returns values for each of the 4 different promotions.
This makes sense, as for a distinct count measure, you can’t simply add up the components, you do need to calculate the subtotal or grand total separately. The query just takes twice as long as a result.
So lets say that the total is irrelevant to the user, so they disable totals within Excel…
Then we refresh the pivot, we should only see a single storage engine call. Right? Wrong.
There are still two calls to the storage engine, and SSAS is still calculating the grand total for the query, which Excel is then just discarding.
In a small cube this may make very little noticeable difference to the pivot performance. But on a multi-terrabyte cube with billions of records, distinct count operations can take minutes to run. And calculating the non-required totals can make a significant difference.
This becomes even worse when multiple attributes are stacked on the rows and/or columns. Each new attribute requires its own totals calculating, resulting in more and more calls to the storage engine. In the following pivot I’ve added an extra two attributes to the rows, with no subtotals or grand totals.
The following trace shows that 8 Subcube queries were performed.
These correspond to detail and totals for each combination of attributes, which calculates as 2^n storage queries, where n is the number of attributes selected on the rows/columns of a pivot.
Now it’s common practice for users to want to create pivots with 4, 5, 6+ attributes on rows/columns, and the workload on the cube increases exponentially with each one they add. This is not good!
There are two workarounds to this:
1) Forcibly disable totals for relevant dimensions in the cube.
2) Use Excel 2013 (or above), and hard code the set of rows to exclude totals.
Let’s look at these in more detail…
1) Forcibly disable totals for relevant dimensions in the cube.
Let’s say that for a particular cube, customer counts were only ever calculated on the monthly level, and it didn’t make sense to aggregate customer counts up to a total across all months/years. We could just add some MDX into the Calculation script to disable date totals for the relevant measure(s).
SCOPE([Measures].[Customer Count], [Date].[Calendar].[All]); THIS=null; END SCOPE;
Excel will still produce an MDX query asking for the date totals, but SSAS will just ignore it and return null. Note that this obviously only works where there is no business sense in asking for the total.
2) Use Excel 2013 (or above), and hard code the set of rows to exclude totals.
Excel 2013 introduced an ability to customise the MDX of a pivot. On the Analyze ribbon, click ‘Fields, Items & Sets’, and then ‘Create Set Based on Row Items…’You can then remove any totals or rows that you don’t want, including the subtotals and grand total.
Or if you’re an MDX guru, you can click on ‘Edit MDX’ and write your own sets for even more performance, and also to make it more dynamic.
When we run this, we get the same output in Excel, but the profile trace now shows:
Just a single call to the storage engine, instead of 16. 1/16 of the work for the same results.
Now beware that once you do this, you’re essentially hard coding that part of the pivot, so in this example any new promotions would not automatically show in the pivot unless they were added. That;s why learning some MDX and writing a more dynamic set is preferable.
But the obvious answer to this is, please, please Microsoft, fix the terrible MDX that Excel creates! If you agree, please upvote the Connect item requesting that it be improved.
Frog-Blog Out
I’m a newbie. I’m looking for approach to resolve slow issue of excel pivot connect SQL 2016 cube. Please tell me how to fire up the profile trace. Thanks.
In your start menu, run “SQL Server 2016 Profiler” (or whatever version of SQL you have). Click “File -> New Trace” then change the server type to “Analysis Services” and connect to your cube.
You can then choose the events that you want to monitor.
Regards
Alex
Thanks. The PC running excel 2016 pivot is Windows 10 and the cube server is Windows 2016. I wonder if you could suggest some customized MDX which will help for release the tension generated by slow issue. Thanks.