0845 643 64 63

MDX

Showing December to December to see opening values for a year in SSAS/MDX

I came across an interesting MDX challenge this week; within a cube’s Date dimension, how to show December twice, once where it should be and again as the opening value for the following year. i.e. for each year I need to show Dec (prev yr), Jan, …, Nov, Dec.

Why? Well if you consider the following pivot chart, you can clearly see growth from Jan to Feb, Feb to Mar, etc., but it’s very difficult to see the growth between Dec and Jan.

So to make things easier to a user we want to repeat the Dec value and insert it before Jan as an opening value for the year. Now we can’t duplicate December’s data in the DW or Cube for obvious reasons, so we need to create a virtual copy of December using some MDX.

Step 1 – Create a new virtual month member called ‘Open’

1
CREATE MEMBER CURRENTCUBE.[Date].[Month Name].[Open] AS null;

Step 2 – Create a named set ‘Dec to Dec’ that includes Open as well as Jan-Dec

1
2
3
4
CREATE STATIC SET CURRENTCUBE.[Dec to Dec]
AS {[Date].[Month Name].[Open]
,[Date].[Month Name].[Month Name].[Jan]
:[Date].[Month Name].[Month Name].[Dec]};

Step 3 – Set the ‘Open’ member to show the value for ‘Dec’ in the previous year

1
2
3
SCOPE ([Date].[Month Name].[Open]);
THIS = ([Date].[Year].PREVMEMBER, [Date].[Month Name].[Dec]);
END SCOPE;

 

When you then select the ‘Dec to Dec’ named set in Excel we do indeed see 13 months per year, including our new ‘Open’ month. There are two problems however:

  1. Open appears at the end not the start. To fix this open up the ‘Field Settings’ of the ‘Dec to Dec’ field in the pivot table, select the ‘Layout and Print’ tab and untick ‘Automatically order and remove duplicates from the set’
  2. If you filter the pivot to only show a single year, Excel decides to hide the ‘Open’ member (it is Excel, as MDX works fine!). To fix this go to ‘PivotTable Options’ then on the ‘totals and Filters’ tab, untick ‘Include filtered items in set totals’.

Hey presto, your pivot table will now look something like this:

You can now clearly see the growth in Dec to Jan, and everyone’s happy.

<Frog-Blog Out>

MDX Compare or Rank Similar Members

Or should this be called: “Recreating DAX’s EARLIER function in MDX” – either way, a useful technique that solves a problem…

MDX makes it very easy for us to compare one member against others, using functions such as RANK() etc. But how do we dynamically compare a member against a subset of other members? I came across a customer requirement recently where we had to rank a member against other members that had similar properties, based on other measures. This will probably make more sense with an example…

Taking AdventureWorks, we can easily calculate the sales rank of each product using something like this:

WITH MEMBER [Measures].[Rank] AS
   RANK([Product].[Product].CURRENTMEMBER
       ,[Product].[Product].[Product].MEMBERS
       ,[Measures].[Internet Sales Amount])
SELECT {[Measures].[Internet Sales Amount], [Measures].[Rank]} ON 0
      ,NONEMPTY({[Product].[Product].[Product].MEMBERS}
                ,[Measures].[Internet Sales Amount]) ON 1
FROM [Adventure Works]
WHERE [Date].[Calendar Year].&[2008]

MDXRank1

This ranks each product against all other products. But is that a fair comparison? Comparing sales of a water bottle against a top of the range racing bike. Not really. So how about we instead rank each product against all products within +/-20% of the same cost. So if the water bottle costs £20, then we would rank its sales against all products with a cost between £16 and £24. This gives a more accurate idea of how well each product is performing compared to its peers.

Although to keep the MDX simple here, let’s just say any product within £20.

In AdventureWorks we have [Measures].[Internet Average Unit Price], which can be used to determine comparable products. So how do we go about achieving this?

If we look at the RANK function, it takes three parameters; the member being ranked, the set over which to rank, and the measure to be used.

All we have to do is filter the second parameter, the set over which to rank, to include similar members. So maybe something like this:

WITH MEMBER [Measures].[Rank] AS
   RANK([Product].[Product].CURRENTMEMBER
       ,FILTER([Product].[Product].[Product].MEMBERS
              ,ABS([Measures].[Internet Average Unit Price]
               -([Measures].[Internet Average Unit Price]
                ,[Product].[Product].CURRENTMEMBER))
               <=20
               )
       ,[Measures].[Internet Sales Amount])

If we break this down, we’re just changing the 2nd parameter to be a filtered set, where the unit price is within £20 of the unit price of the current member. This should work right?

Unfortunately, wrong. The results look exactly the same as the original rank – nothing has changed.

MDXRank2

The problem here is that CURRENTMEMBER is within the filter function, so it changes context to refer to whatever row is being considered at the time by the filter function. So [Measures].[Internet Average Unit Price] and ([Measures].[Internet Average Unit Price],[Product].[Product].CURRENTMEMBER) are always the same product, and no rows are filtered out. CURRENTMEMBER does NOT refer to the current member being considered by the RANK function, but by the FILTER function.

In DAX we have the EARLIER and EARLIEST functions, which would be great here, and would allow us to step out of the current context into the previous calculation layer. But unfortunately we haven’t been blessed with an MDX EARLIER function. So how do we fix this in MDX?

The trick here is dynamic sets, using the STRTOSET function. This allows us to grab the member being ranked, and treat it as a fixed member within the FILTER function.

WITH MEMBER [Measures].[Rank] AS
   RANK([Product].[Product].CURRENTMEMBER
       ,STRTOSET('
          FILTER([Product].[Product].[Product].MEMBERS
                ,ABS([Measures].[Internet Average Unit Price]
                 -' + CSTR([Measures].[Internet Average Unit Price]) + ')
                 <=20
                )'
       )
       ,[Measures].[Internet Sales Amount])

We building up a string, which will fix the value of [Measures].[Internet Average Unit Price] to that of the product being ranked, and will then dynamically compare it to the value of [Measures].[Internet Average Unit Price] for all other products. Those within £20 will be included in the resulting set, and will be used to rank the original product.

MDXRankYou can see the result in the screenshot above, where the ranking is dependent on the average unit price.

Frog-Blog Out

Excel creates inefficient MDX

Whilst investigating a slow set of Excel pivot tables (connected to a multidimensional SSAS cube), I found some odd behaviour in how Excel generates its MDX, sometimes doing far more work than if required and causing significant performance reduction.

Take the following example, just querying Customer Count by Promotion, against the AdventureWorks cube.

ExcelMDX01

The profile trace (just using “Query Begin”, “Query End” and “Query Subcube Verbose”) shows that the query runs two “Query Subcube Verbose” calls, which means two separate calls to the SSAS storage engine.

The first Query Subcube event requests “0” for Promotion, which is the [All] member, and so is used as the total in the pivot table. The second queries “*”, which returns every member in the Promotion attribute, which in this example returns values for each of the 4 different promotions.

This makes sense, as for a distinct count measure, you can’t simply add up the components, you do need to calculate the subtotal or grand total separately. The query just takes twice as long as a result.

So lets say that the total is irrelevant to the user, so they disable totals within Excel…

Then we refresh the pivot, we should only see a single storage engine call. Right? Wrong.

ExcelMDX04There are still two calls to the storage engine, and SSAS is still calculating the grand total for the query, which Excel is then just discarding.

In a small cube this may make very little noticeable difference to the pivot performance. But on a multi-terrabyte cube with billions of records, distinct count operations can take minutes to run. And calculating the non-required totals can make a significant difference.

This becomes even worse when multiple attributes are stacked on the rows and/or columns. Each new attribute requires its own totals calculating, resulting in more and more calls to the storage engine. In the following pivot I’ve added an extra two attributes to the rows, with no subtotals or grand totals.

ExcelMDX05The following trace shows that 8 Subcube queries were performed.

ExcelMDX06These correspond to detail and totals for each combination of attributes, which calculates as 2^n storage queries, where n is the number of attributes selected on the rows/columns of a pivot.

Now it’s common practice for users to want to create pivots with 4, 5, 6+ attributes on rows/columns, and the workload on the cube increases exponentially with each one they add. This is not good!

There are two workarounds to this:

1) Forcibly disable totals for relevant dimensions in the cube.

2) Use Excel 2013 (or above), and hard code the set of rows to exclude totals.

 

Let’s look at these in more detail…

1) Forcibly disable totals for relevant dimensions in the cube.

Let’s say that for a particular cube, customer counts were only ever calculated on the monthly level, and it didn’t make sense to aggregate customer counts up to a total across all months/years. We could just add some MDX into the Calculation script to disable date totals for the relevant measure(s).

SCOPE([Measures].[Customer Count], [Date].[Calendar].[All]);
      THIS=null;
END SCOPE;

Excel will still produce an MDX query asking for the date totals, but SSAS will just ignore it and return null. Note that this obviously only works where there is no business sense in asking for the total.

 

2) Use Excel 2013 (or above), and hard code the set of rows to exclude totals.

Excel 2013 introduced an ability to customise the MDX of a pivot. On the Analyze ribbon, click ‘Fields, Items & Sets’, and then ‘Create Set Based on Row Items…’ExcelMDX07You can then remove any totals or rows that you don’t want, including the subtotals and grand total.

ExcelMDX08Or if you’re an MDX guru, you can click on ‘Edit MDX’ and write your own sets for even more performance, and also to make it more dynamic.

When we run this, we get the same output in Excel, but the profile trace now shows:

ExcelMDX09Just a single call to the storage engine, instead of 16. 1/16 of the work for the same results.

Now beware that once you do this, you’re essentially hard coding that part of the pivot, so in this example any new promotions would not automatically show in the pivot unless they were added. That;s why learning some MDX and writing a more dynamic set is preferable.

But the obvious answer to this is, please, please Microsoft, fix the terrible MDX that Excel creates! If you agree, please upvote the Connect item requesting that it be improved.

Frog-Blog Out

 

MDX Between Start Date and End Date

How do you use MDX to find the number of items between two dates, when the start date and end date are different role playing dimensions?

I had an interesting question from Ricardo Santos on another frog-blog post this week (MDX Calculated Member spanning multiple date dimensions). I answered Ricardo in the post, but it was a good question which I thought deserved a full blog post on its own. So thanks to Ricardo for prompting this.

The Scenario

If you have a ‘Project’ fact table, which has two dates (start date and end date), how do you find out how many projects are active on any given date. In this scenario we would have three role playing date dimensions; Date, Start Date & End Date.

We want the user to be able to select a date (from the Date dimension) and then use that to filter the projects using the other two date dimensions.

This also applies to many other scenarios; accomodation occupancy, staffing levels, contracts, really anything with a start and end date.

The MDX answer

If there was just a single date dimension (i.e. a typical sales fact) then you could just filter it

1
  ([Measures].[Internet Sales Amount], [DATE].[Calendar].&[20080110])

But what we need to do is combine two different date dimensions. In SQL, to find the number of active projects on 1st April 2013 we’d just use

1
2
3
4
   SELECT COUNT(*)
   FROM Projects
   WHERE StartDate >= 20130401
     AND EndDate   <= 20130401

The equivalent in MDX is more complex as we have to filter a fact by two different dimensions. Starting with the [Start Date] dimension, we need to sum the project count for the projects with a start date on or before the required date:

1
    AGGREGATE({NULL:[START DATE].[Calendar].[20130401]}, [Measures].[Project COUNT])

We also need to then find the project count for all projects with an end date on or after the required date

1
    AGGREGATE({[END DATE].[Calendar].[20130401]:NULL}, [Measures].[Project COUNT])

We can then use crossjoin (*) to combine them together.

1
2
3
    AGGREGATE( {NULL:[START DATE].[Calendar].[20130301]}
             * {[END DATE].[Calendar].[20130301]:NULL}
             , [Measures].[Project COUNT])

Note that you could also use FILTER to achieve the same goal.

This will find the project count for a specific date, but we now need to make the date dynamic. We want the calculation to use a date selected by the user from a 3rd [Date] dimension, applying this date to the other two dimensions (start date & end date).

Enter the LINKMEMBER function. This is a great tool in your MDX arsenal, allowing you to identify a member of one dimension from a selection in a different role playing dimension. e.g.

1
     LINKMEMBER([DATE].[Calendar].[20130414], [START DATE].[Calendar])

The example above will return the following member of the Start Date dimension.

1
     [START DATE].[Calendar].[20130414]

When we integrate this into the previous MDX, we get:

1
2
3
4
5
6
    AGGREGATE( 
          {NULL:LINKMEMBER([DATE].[Calendar].CURRENTMEMBER
                     ,[START DATE].[Calendar])}
        * {LINKMEMBER([DATE].[Calendar].CURRENTMEMBER
                    , [END DATE].[Calendar]):NULL}
        , [Measures].[Project COUNT])

This will take the CURRENTMEMBER of the [Date].[Calendar] hierarchy, and map it to the equivalent members in the [Start Date].[Calendar] and [End Date].[Calendar] hierarchies, before then using the resulting members to filter the Project Count measure.

And there you have it.

But, that’s not strictly the right answer…

Data modelling

Whenever we have to write anything complicated in MDX, we need to ask ourselves the question of why?… Complicated MDX is all too often a symptom of poor data modelling. Usually caused when a data warehouse is designed with SQL in mind, not cubes/MDX. If you’re planning to use cubes then you can reap huge benefits from remodelling your data accordingly. This could be in your ETL and warehouse, or you can just remodel your data in the cube data source view. Lets have a look at an example.

The DSV for the project fact we were looking at earlier would be something like this (simplified)

1
2
3
4
5
6
7
    SELECT 
       , ProjectManagerKey
       , CustomerKey
       , StartDateKey
       , EndDateKey
       , 1 AS ProjectCount
    FROM FactProject

So the project fact represents a project. A cube fact table should however represent an event, not an item (items should be dimensions, not facts). So we need to think of a project as two seperate events; a start and an end. Take a look at the query below

1
2
3
4
5
6
7
8
9
10
11
12
13
    SELECT 
       , ProjectManagerKey
       , CustomerKey
       , StartDateKey AS DateKey
       , 1 AS ProjectCount
    FROM FactProject
	UNION ALL
    SELECT 
       , ProjectManagerKey
       , CustomerKey
       , EndDateKey AS DateKey
       , -1 AS ProjectCount
    FROM FactProject

Each project will therefore exist twice, once with a +1 on the project start date, and once with a -1 on the project end date. And we only have a single date dimension to worry about.

So how do we use this in MDX?

You can find out the number of active projects on any given date using

1
2
    AGGREGATE({NULL:[DATE].[Calendar].CURRENTMEMBER}
            , [Measures].[Project COUNT])

i.e. the sum of all projects started/finished prior to the selected date.

To find out the change in the number of projects in any given time period

1
2
    ([DATE].[Calendar].CURRENTMEMBER
    ,[Measures].[Project COUNT])

In the SQL world this is an inefficient approach, but it’s perfectly suited to the way cubes work.

So by spending a little time thinking about the cube’s data model, we can dramatically simplify our MDX, and improve our performance.

Frog-blog out

Querying dimension members in MDX

Today we’re back looking at MDX with a quick how-to. How do you extract a list of dimension members using MDX?

This could be used for example to populate a user interface/parameter of a report/dashboard.

Let’s look at two scenarios; normal dimensions, and then hierarchies. All the queries here use the “Adventure Works DW 2008 R2” “Adventure Works” cube.

Standard dimension attributes

So how do we get a list of dimension members? Well lets divide the query up into rows and columns. We want the members listed on the rows.

1
2
3
4
SELECT
  xxx ON COLUMNS,
  [Geography].[Country].ALLMEMBERS ON ROWS
FROM [Adventure Works]

Why use .ALLMEMBERS instead of .MEMBERS? .ALLMEMBERS will include calculated members, whereas .MEMBERS won’t.

So what do we put on the rows? We want the display name for showing to the user, and the unique name so we can refer back to each exact member. We can get these by using properties of the current member, namely MEMBER_CAPTION and UNIQUENAME. Get a full list of dimension member properties here. To make this work, we just define a number of query-scoped calculated members using WITH, and then include them in the columns set…

1
2
3
4
5
6
7
8
WITH 
   MEMBER [Measures].[Label] AS [Geography].[Country].CURRENTMEMBER.MEMBER_CAPTION 
   MEMBER [Measures].[UniqueName] AS [Geography].[Country].CURRENTMEMBER.UNIQUENAME 
SELECT {[Measures].[Label], 
        [Measures].[UniqueName]
        } ON COLUMNS , 
      [Geography].[Country].ALLMEMBERS ON ROWS 
FROM [Adventure Works]

If you run this, you get the following output:

This may be what you want, but note that the first row of the result set contains the ‘ALL’ member, which you probably don’t want. You can remove this by altering the ROWS clause to exclude them. This is simply a case of repeating the hierarchy name, e.g. [Geography].[Country].[Country].ALLMEMBERS

1
2
3
4
5
6
7
8
WITH 
   MEMBER [Measures].[Label] AS [Geography].[Country].CURRENTMEMBER.MEMBER_CAPTION 
   MEMBER [Measures].[UniqueName] AS [Geography].[Country].CURRENTMEMBER.UNIQUENAME 
SELECT {[Measures].[Label], 
        [Measures].[UniqueName]
        } ON COLUMNS , 
      [Geography].[Country].[Country].ALLMEMBERS ON ROWS 
FROM [Adventure Works]

Hierarchies

With simple dimension attributes, you get a flat list of values. With hierarchies, whether standard user hierarchies, ragged or parent-child, you need to return a multi-level list. To do this we need to know what level each member is at. We can query this using the LEVEL.ORDINAL property. Adding this into the query, and replacing Geography for Employee, we get the following query:

1
2
3
4
5
6
7
8
9
10
WITH 
   MEMBER [Measures].[Label] AS [Employee].[Employees].CURRENTMEMBER.MEMBER_CAPTION 
   MEMBER [Measures].[UniqueName] AS [Employee].[Employees].CURRENTMEMBER.UNIQUENAME 
   MEMBER [Measures].[Ordinal] AS [Employee].[Employees].CURRENTMEMBER.LEVEL.ORDINAL 
SELECT {[Measures].[Label], 
        [Measures].[UniqueName], 
        [Measures].[Ordinal]
        } ON COLUMNS , 
      [Employee].[Employees].ALLMEMBERS ON ROWS 
FROM [Adventure Works]

With the following results:

You can then do all sorts of funky things using basic MDX navigation functions. e.g. returning the parent of each member, by simply adding …PARENT.UNIQUENAME as another column

1
2
3
4
5
6
7
8
9
10
11
12
WITH 
   MEMBER [Measures].[Label] AS [Employee].[Employees].CURRENTMEMBER.MEMBER_CAPTION 
   MEMBER [Measures].[UniqueName] AS [Employee].[Employees].CURRENTMEMBER.UNIQUENAME 
   MEMBER [Measures].[Ordinal] AS [Employee].[Employees].CURRENTMEMBER.LEVEL.ORDINAL 
   MEMBER [Measures].[Parent] AS [Employee].[Employees].PARENT.UNIQUENAME
SELECT {[Measures].[Label], 
        [Measures].[UniqueName], 
        [Measures].[Ordinal],
        [Measures].[Parent]
        } ON COLUMNS , 
      [Employee].[Employees].ALLMEMBERS ON ROWS 
FROM [Adventure Works]

Now go forth and query those dimension members…
FrogBlog-Out

Find first order date in MDX

I had an interesting question the other day; “how do I find the date of the first order/activity for a given customer/employee/product etc in MDX”?

Well you’ll be pleased to know that it’s pretty simple. The sample code bellow will work with the AdventureWorks DW 2008R2 cube.

First of all we need to filter the date dimension to contain only those with orders. Then take the first item of the resulting set, then find its name. Simples.

Filter the date hierarchy to only those dates with a Reseller Order Count:

   FILTER([Date].[Date].[Date].MEMBERS, [Measures].[Reseller Order count])

Note that you can specify a more advanced filter, such as [Measures].[Reseller Order Count]>10, which would find the first date which had more than 10 orders.

Then find the first item:

   .ITEM(0)

Then find the its name:

   .NAME

Put this together in a query and you get:

   WITH MEMBER [Measures].[First Activity] AS
       FILTER([Date].[Date].[Date].MEMBERS
            , [Measures].[Reseller Order count]).ITEM(0).NAME
   SELECT {[Measures].[Reseller Order count]
         , [Measures].[First Activity]
       } ON 0,
      [Employee].[Employees].MEMBERS ON 1
   FROM [Adventure Works]

This returns the total reseller order count for each employee, along with the date of their first reseller order. Note that the Employee hierarchy here is a parent child hierarchy. The calculated member [First Activity] will aggregate and calculate correctly with any dimension or member specified on the 1 (ROWS) axis, be it a parent-child, single attribute, normal hierarchy, etc. and will always find the first order date for the current member.
You should get the following results:

Frog-Blog Out.

Debug MDX queries using Drillthrough in SSMS

One of the great features of using Excel to browse an SSAS OLAP cube is the drillthrough ability. If you double click on any cell of an OLAP pivot table, Excel will create a new worksheet containing the top 1000 fact records that went to make up the figure in the selected cell.

N.B. The limit of 1000 rows can be altered, as per one of my previous blog posts here.

This feature is pretty well known, but not many folk realise how easy it is to reproduce this in SQL Server Management Studio (SSMS). All you need to do is prefix your query with DRILLTHROUGH.

i.e. Assuming an MDX query of

SELECT [Measures].[Internet Sales Amount] ON 0
FROM [Adventure Works]
WHERE [Date].[January 1, 2004]

Which returns the following results…

A query of

DRILLTHROUGH
SELECT [Measures].[Internet Sales Amount] ON 0
FROM [Adventure Works]
WHERE [Date].[January 1, 2004]

Returns the records contributing to the total figure. Great for diagnosing problems with an MDX query.

By default, only the first 10,000 rows are returned, but you can override this using MAXROWS

DRILLTHROUGH MAXROWS 500
SELECT [Measures].[Internet Sales Amount] ON 0
FROM [Adventure Works]
WHERE [Date].[January 1, 2004]

The columns that are returned are those defined in the Actions tab of the Cube Designer in BIDS (The Business Intelligence Development Studio).

If no action is defined, then the fact measures will be returned along with the keys that link to each relevant dimension, which tend not to be that helpful.

You can override the returned columns by using the RETURN clause

DRILLTHROUGH SELECT [Measures].[Internet Sales Amount] ON 0 FROM [Adventure Works] WHERE [Date].[January 1, 2004] RETURN [$Internet Sales Order Details].[Internet Sales Order] ,[$Sales Territory].[Sales Territory Region] ,NAME([$Product].[Product]) ,KEY([$Product].[Product]) ,UNIQUENAME([$Product].[Product]) ,[Internet Sales].[Internet Sales Amount] ,[Internet Sales].[Internet Order Quantity]


Note that there are some restrictions on what you can drill through

  • You can’t drill through an expression/calculation, only a raw measure
  • The MDX query needs to return a single cell (otherwise the cube does not know which one to drill through)
  • The data returned will be at the lowest granularity of the cube’s fact table

To explain the last point further, the cube does not return the raw data from the underlying data warehouse, but a summary of the facts grouped by unique combination of the relevant dimensions. i.e. if a warehouse table containing individual sales (by date, product, customer & store) is brought into a cube as a fact table that only has relationships with the date and product dimensions, then the cube drill through will return unique combinations of date and product, summarising sales for each combination. Extra granularity which the warehouse may contain (customer and store) will not be available.

Note that if you specify the RETURN columns, the rows are still returned at the lowest level of the fact table granularity, even if not all of the dimensions are brought out as columns. This may result in returning multiple identical records. Don’t worry, these will be distinct facts, just differentiated by a dimension/attribute that isn’t being returned.

You can find out more on TechNet here

Frog-Blog Out

Deploying MDX calculation scripts with xmla

If you’re a Business Intelligence developer I assume you have BIDS Helper installed. If not then stop reading this post and go and install it. Now. It adds a number of very useful features to the Business Intelligence Development Studio which provide help with many aspects of SSIS, SSRS and SSAS development.

One of my favourite utilities is the Deploy MDX Script function. This takes the calculation script for an SSAS cube (named sets, calculated measures, scope logic, etc.) and deploys it in isolation without having to redeploy and rebuild the entire cube. This is a life saver when trying to write and test complex MDX calculations, and has saved me days if not weeks of waiting around.

The Deploy MDX Script button works perfectly when deploying updated script to the development environment, but what if you want to deploy the same script changes to a testing or live environment? Is there a way of scripting the change without redeploying the entire cube?

Yes there is, by using the following xmla script. Just change the DatabaseID and CubeID elements of the Object element to point to your Analysis Services database and cube, and paste your MDX calculation script in between the <Text> and </Text> tags. Run the script in SQL Server Management Studio and it should update the cube with the new script.

This script works for SQL Server 2008 and SQL Server 2008 R2.

<Alter AllowCreate="true" ObjectExpansion="ExpandFull" xmlns="http://schemas.microsoft.com/analysisservices/2003/engine" xmlns:as="http://schemas.microsoft.com/analysisservices/2003/engine"> <Object> <DatabaseID>YourDatabaseName</DatabaseID> <CubeID>YourCubeName</CubeID> <MdxScriptID>MdxScript</MdxScriptID> </Object> <ObjectDefinition> <MdxScript> <ID>MdxScript</ID> <Name>MdxScript</Name> <Commands xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100" xmlns:dwd="http://schemas.microsoft.com/DataWarehouse/Designer/1.0"> <Command> <Text> /* The CALCULATE command controls the aggregation of leaf cells in the cube. If the CALCULATE command is deleted or modified, the data within the cube is affected. You should edit this command only if you manually specify how the cube is aggregated. */ CALCULATE; ------------------------------------------------ --Paste your MDX Calculations here ------------------------------------------------ </Text> </Command> </Commands> </MdxScript> </ObjectDefinition> </Alter>

And there you have it, you can update your MDX calculated members outside of BIDS without doing a full deploy.

Frog-Blog Out

Will DAX replace MDX

For those that haven’t yet heard of DAX, it’s an expression language developed by Microsoft to perform calculations against PowerPivot. Stepping back one step further, PowerPivot is essentially a local Analysis Services cube that runs within Excel 2010.

I’ve heard plenty of comments from various sources about how DAX is the [multi-dimensional] query language of the future and how it’s going to kill off MDX. Ok…. well no, it’s not ok.

For starters they both exist for two different purposes. DAX is not a query language but an expression language. You can use MDX to query information from a cube and generate a pivot, you can’t with DAX as it is not a query language.

The best way to think of DAX is as an extension to Excel formulas, you can use it to perform a calculation against an existing set of cells, in this case, a PowerPivot dataset.

It is also similar to an MDX calculated member, and in fact supports a number of MDX functions (TotalYTD, ParallelPeriod etc.).

If your data is in a database: You would use SQL to query data from a database and import the results into Excel. You would then use Excel expressions/calculations to enhance the data.

If your data is in a cube: You would use an Excel pivot (or MDX query) to query the data and import the results into Excel. You then have to use a third party tool such as OLAP PivotTable Extensions to add expressions/calculations to enhance the data.

If your data is in PowerPivot: You would use PowerPivot to query the data and import the results into Excel. You would then use DAX to add calculations to enhance the data.

DAX is a fantastic expression tool, and one that provides significant power to PowerPivot, but no, it won’t replace MDX. My hope is that Microsoft will provide DAX capability for MDX queries as well, and not restrict it to PowerPivot queries. As I’ve shown in my previous blog post it’s a great expression language that would provide significant benefit to cube users.

Calculate Run Rate (Full Year Projection) in MDX

This post explains how to create an MDX calculated member that will take a value from the cube and project it forward to the end of the year. This provides a simple mechanism for calculating what your expected total will be at year end, based upon current performance.

To do this more accurately you should use time series data mining models in SSAS and use DMX expressions to query the results, but this method is very simple and requires little effort, and will be pretty accurate so long as the data you’re modelling is fairly linear. Please note though that the more cyclical and seasonal your data is the less effective this will be.

The basic idea is that we take what we have done so far (i.e. year to date sales), look at how far through the year we are, and extrapolate the value of future months (or days/weeks/etc.) based upon values so far.

i.e. If we’re at March month end and we’ve sold 100 widgets so far this year, we’re 1/4 of the way through the year so we multiply 100 by 4 and get a prejected yearly total of 400.


This chart shows the concept of what we’re doing, and shows the full year prejections calculated in March (with 3 months of available data) and June (6 months of data). The projections obviously get more accurate the further you are through the year.

One of the points to note is that when creating a calculation like this, based upon a time dimension, the calculation should always work with any level of the dimension hierarchy selected. i.e. The user shouldn’t care whether they’re looking at a month, week, quarter or a day, the calculation should always work the same. To achieve this we simply use the .currentmember of the time hierarchy.

The following examples are based upon projecting the Internet Sales Amount measure found within the SQL Server 2008 Adventure Works DW sample cube.

Step 1 – What are our total sales so far this year?

MDX helpfully provides us with the YTD function which takes care of this for us.


  MEMBER [Measures].[YTD Sales] AS
    AGGREGATE(
      YTD([Date].[Calendar].CurrentMember)
      ,[Measures].[Internet Sales Amount])

This takes the current member of the Calendar hierarchy, and creates a set of all dates before it (this year) using YTD. It then aggregates (in this case sums) the Internet Sales Amount for all of these dates to calculate YTD Sales.

Step 2 – Which period are we in?

Here we’ll use the same YTD function to create a set of all dates so far this year, but in this case we’ll count the number of resulting members. Note that because we’re using the .CurrentMember of the hierarchy, it doesn’t matter if we’re looking at a date, week or month, the MDX will work. i.e. If we’re looking at 21 Jan it will return 21. If we’re looking at Q3 it will return 3, August will return 8 etc.


  MEMBER [Measures].[CurPeriod] AS
    COUNT(
      YTD([Date].[Calendar].CurrentMember)
      ,INCLUDEEMPTY)

Step 3 – How many periods are in the year?

If we coded this to only work with months then we could hard code this to 12 however we need to keep it generic to all levels of the hierarchy. So, we have to count all the cousins of the current time member [within this year]. Unfortunately there isn’t a Cousins function in MDX, and Siblings will only return other members within the same parent. i.e. siblings of May 4th would include May 1 through to May 31. To get around this we find the year of the current member by using the Ancestor function.


  ANCESTOR([Date].[Calendar].CurrentMember
  , [Date].[Calendar].[Calendar Year])

Then we find all of the descendants of the year, at the same level of the hierarchy (week/day/etc.) as the current member. We can then take a count as before.


  MEMBER [Measures].[TotalPeriods] AS
    COUNT(
      DESCENDANTS(
        ANCESTOR([Date].[Calendar].CurrentMember
          ,[Date].[Calendar].[Calendar Year])
        ,[Date].[Calendar].CurrentMember.level)
      ,INCLUDEEMPTY)

Step 4 – Calculate the Run Rate

Calculating the prejected yearly total (run rate) is then a simple calculation


  MEMBER [Measures].[Full Year Run Rate] AS
    [Measures].[YTD Sales]
    * ([Measures].[TotalPeriods]
       /[Measures].[CurPeriod])

You can then put the whole lot together and see the results…


WITH

  MEMBER [Measures].[YTD Sales] AS
    AGGREGATE(
      YTD([Date].[Calendar].CurrentMember)
      ,[Measures].[Internet Sales Amount])

  MEMBER [Measures].[CurPeriod] AS
    COUNT(
      YTD([Date].[Calendar].CurrentMember)
      ,INCLUDEEMPTY)

  MEMBER [Measures].[TotalPeriods] AS
    COUNT(
      DESCENDANTS(
        ANCESTOR([Date].[Calendar].CurrentMember
          ,[Date].[Calendar].[Calendar Year])
        ,[Date].[Calendar].CurrentMember.level)
      ,INCLUDEEMPTY)

  MEMBER [Measures].[Full Year Run Rate] AS
    [Measures].[YTD Sales]
    * ([Measures].[TotalPeriods]
       /[Measures].[CurPeriod])

SELECT
{
     [Measures].[Internet Sales Amount]
    ,[Measures].[YTD Sales]
    ,[Measures].[Full Year Run Rate]
    ,[Measures].[CurPeriod]
    ,[Measures].[TotalPeriods]
} ON 0,
{
    DESCENDANTS([Date].[Calendar].[CY 2003])
} ON 1
FROM [Direct Sales]

In my next blog I’ll be diong the same calculation in DAX for use with PowerPivot, stay tuned…

Frog-Blog Out

The Frog Blog

I'm Alex Whittles.

I specialise in designing and implementing SQL Server business intelligence solutions, and this is my blog! Just a collection of thoughts, techniques and ramblings on SQL Server, Cubes, Data Warehouses, MDX, DAX and whatever else comes to mind.

Data Platform MVP

Frog Blog Out
twitter
rssicon