0845 643 64 63

Analysis Services

MDX Sub select Vs WHERE clause

I’ve just read an interesting thread on the SQL Server Developer Center forum, regarding how to filter results. Specifically the difference in MDX between using a subselect

SELECT x on COLUMNS, y on ROWS FROM ( SELECT z on COLUMNS FROM cube))

or using a where clause

SELECT x on COLUMNS, y on ROWS FROM cube WHERE z

In a simple query they produce the same results, but what is the actual difference? You can read the full thread here, but to summarise Darren Gosbell’s response…

Using the WHERE clause sets the query context and consequently the CurrentMember. This then enables functions such as YTD and PerdiodsToDate to work.

Using a subselect can provide improved performance, but does not set the context.

Simples..!

Excel 2007 and SSAS 2008 Error

I was working on a new SSAS 2008 cube today, and came across an error which Google was unable to help with. I thought I’d post the solution here to help anyone else who may encounter it.

The cube in question will be primarily be accessed using Excel 2007, so I’d been dutifully testing it along the way to ensure all was well. And then, after a number of changes the following error appeared when connecting to the cube from Excel to create a pivot table.

Excel was unable to get necessary information about this cube. The cube might have been reorganized or changed on the server.

Contact the OLAP cube administrator and, if necessary, set up a new data source to connect to the cube

Connecting and querying the cube via SSMS or BIDS worked without error (hense I didn’t spot the error sooner!).

A quick Google revealed a number of posts regarding this error, but they all related to attributes containing invalid characters when accessed from Excel 2000 Or problems with translations and locale settings in the .oqy file. Neither of these was the cause here, so I had to go back and recreate every change I had made step by step to track the problem.

Well, I’m please to report that in the end it was nothing more that a simple spelling mistake in a named set. One of the dynamic named sets in the cube calculations referred to a specific member of a dimension, which was spelled slightly incorrectly. (Simplified example..)

CREATE DYNAMIC SET CURRENTCUBE.[Set1]
 AS {[Dimension].[Attribute].[Value1],
     [Dimension].[Attribute].[Value2WithTypo]
    };

When querying calculated measures through MDX in SSMS, the MDX parser just ignored the problem and only uses the valid members, however it appears as though Excel 2007 is slightly more picky with its cubes.

Useful to know, and even more useful when used as a tool to double check for any errors in the MDX calculations.

Mosha's MDX Studio

I almost feel embarrassed…, I’ve been writing this blog for over 9 months now, and I have yet to mention Mosha, although in my defence, there is a link to his blog in the links section to the right.

As many/most of you may know, Mosha Pasumansky is one the key brains behind designing the MDX language and Analysis Services – nuff said?

Over the last year he has been working on a pet project, MDX Studio. It’s an MDX query tool which any self respecting OLAP developer should now be using on a regular basis. He has just released v0.4.6, which adds some really nifty features such as the dependency view.

If you’re just starting out with MDX, then the intellisense will be of massive benefit to you; even if you’re a seasoned pro, the performance monitoring is an essential tool on its own.

If you haven’t already tried it, have a look at Mosha’s blog, and get a copy – you won’t regret it.

And thanks for all your hard work Mosha – It’s much appreciated.

Alex

Ranking results from MDX queries

This post explains how you can create a ranking of data from an OLAP MDX query. This will take the results from the query, and assign a ranking to each row. i.e. 1st, 2nd, 3rd best rows etc.

The first thing to do is to decide two things.
1) What measure do you want to rank by
2) What data set are you returning

Let’s assume we want to rank all stores by sales value. The basic non-ranked MDX query would be something like this

  SELECT
    {[Measures].[Sales Value]} ON 0,
    {[Store].[Store Name].members} ON 1
  FROM
    [SalesCube]

So our measure is Sales Value, and our data set (granularity) is Store Name. We now want to create an ordered set of this data, ordered by Sales Value. We do this with the ORDER() function, which takes a set, a measure and either ascending or descending as its parameters. Note that by specifying the attribute twice we remove the [All] member from the set.

  WITH SET [OrderedSet] AS
    ORDER([Store].[Store Name].[Store Name].MEMBERS,
	  [Measures].[Sales Value],
	  BDESC)
  SELECT
    {[Measures].[Sales Value]} ON 0,
    {[OrderedSet]} ON 1
  FROM
    [SalesCube]

The next stage is to apply a ranking to this ordered set. Helpfully, MDX provides us with a Rank() function, which takes a member and a set as its parameters. All it does is locate the member within the set and return its position. Because we have ordered the set, it will give us the ranking.

  WITH SET [OrderedSet] AS
    ORDER([Store].[Store Name].[Store Name].MEMBERS,
	  [Measures].[Sales Value],
	  BDESC)
  MEMBER [Measures].[Rank] AS
    RANK([Store].[Store Name].CurrentMember,
	 [OrderedSet])
  SELECT
    {[Measures].[Rank], [Measures].[Sales Value]} ON 0,
    {[OrderedSet]} ON 1
  FROM
    [SalesCube]

You can now easily expand this to only show you the top x records, by using the Head() function on the ordered set. In this example we’re only showing the top 10. You could also use Tail() to find the bottom x records. Other functions you can use include TopPercent(), TopSum(), BottomPercent() and BottomSum().

  WITH SET [OrderedSet] AS
    ORDER([Store].[Store Name].[Store Name].MEMBERS,
	  [Measures].[Sales Value],
	  BDESC)
  MEMBER [Measures].[Rank] AS
    RANK([Store].[Store Name].CurrentMember,
	 [OrderedSet])
  SELECT
    {[Measures].[Rank], [Measures].[Sales Value]} ON 0,
    {HEAD([OrderedSet], 10)} ON 1
  FROM
    [SalesCube]

Semi Additive Measures using SQL Server Standard

One of the most frustrating limitations of SQL Server 2005 Standard edition is that it doesn’t support semi additive measures in SSAS Analysis Services cubes. This post explains a work around that provides similar functionality without having to shell out for the Enterprise Edition.

What Are Semi Additive Measures?

Semi Additive measures are values that you can summarise across any related dimension except time.

For example, Sales and costs are fully additive; if you sell 100 yesterday and 50 today then you’ve sold 150 in total. You can add them up over time.

Stock levels however are semi additive; if you had 100 in stock yesterday, and 50 in stock today, you’re total stock is 50, not 150. It doesn’t make sense to add up the measures over time, you need to find the most recent value.

Why are they important?

Whether they are important to you or not depends entirely on what you are trying to do with your cube. If all of your required measures are fully additive then you really don’t need to worry about anything. However as soon as you want to include measures such as stock levels, salarys, share prices or test results then they become pretty much essential.

Why are they not available in SQL Standard edition?

Microsoft has to have some way of pursuading us to pay for the Enterprise edition!

How can I get this functionality within SQL Standard?

Firstly we need to understand what semi additive measures do. By far the most common aggregation used is the LastNonEmpty function, so we’ll stick with that as an example. This basically says that whatever time frame you are looking at, find the most recent value for each tuple. This really is a fantastically powerful function, which only really becomes apparent whan you don’t have it!

Lets say that you perform a stock take of different products on different days of the week. You will have a stock entry for product A on a Thursday and product B on a Friday. The LastNonEmpty function takes care of this for you, if you look at the stock level on Saturday it will give you the correct values for both A and B, even though you didn’t perform a physical stock take on the Saturday.

If you then add the time dimension into the query, SSAS will perform this function for each and every time attribute shown, and then aggregate the results up to any other dimensions used. i.e. Each month will then display the sum of all LastNonEmpty values for all products within that month, essentially the closing stock level for each and every month.

To replicate this in Standard Edition, we need to split the work up into two stages.
1) Create daily values in the data warehouse
2) Use MDX to select a single value from the time dimension.

Think of this as splitting up the LastNonEmpty function into two, ‘Last’ and ‘Non Empty’. The ‘Non Empty’ bit essentially fills in the blanks for us. If a value doesn’t exist for that particular day, it looks at the previous day’s value. The ‘Last’ bit says that if we are looking at months in our query, find the value for the last day in that month. The same goes for years, or indeed any other time attribute.

To code up a full LastNonEmpty function ourselves in MDX would be too slow to query as soon as you get a cube of any reasonable size. One of the key benefits of a cube is speed of querying data and we don’t want to impact this too much, therefore we move some of the donkey work into the ETL process populating the datawarehouse. This leaves the cube to perform a simple enough calculation so as to not cause any problems.

1) The ‘Non Empty’ bit

Lets say that have a table called tblStock, containing the following data

We need to expand this into a new fact table that contains one record per day per product.

There are a number of ways of doing this, I’ll describe one here that should suit most situations, although you may need to customise it to your own situation, and limit it to only updating changed/new records rather than re-populating the entire table, but you get the idea. I should point out that you would be much better off populating this as part of your ETL process, but I’m showing this method as it’s more generic.

You need a list of all available dates relevant to your data warehouse or cube. If you already have a time dimension table then use this, otherwise create a SQL function that returns you a list of dates, such as this one:


   CREATE FUNCTION [dbo].[FN_ReturnAllDates](
         @DateFrom DateTime, @DateTo DateTime)
         RETURNS @List TABLE (Date DateTime)
     BEGIN
     DECLARE @tmpDate DateTime
     SET @tmpDate = @DateFrom
     WHILE @tmpDate<=@DateTo
       BEGIN
         INSERT INTO @List
           SELECT Convert(datetime,
                 Convert(Nvarchar,@tmpDate, 102), 102)
         SET @tmpDate = Dateadd(d,1,@tmpDate)
       END
     RETURN
   END

We need to perform a full outer join between the date dimension and any other relevant dimensions, in this case product. This will generate one record per product per date. We can then perform a sub query for each combination to find the stock level appropriate for that day. (Yes, this will be a slow query to run – I did say you should do it in your ETL process!)


     INSERT INTO FactStock
        (StockTakeDate, ProductID, StockLevel)
     SELECT D.Date, P.ProductID,
           ISNULL((SELECT TOP 1 StockLevel
              FROM tblStock
              WHERE ProductID = P.ProductID
                 AND StockTakeDate<=D.Date
              ORDER BY StockTakeDate DESC),0)
        FROM FN_ReturnAllDates((SELECT Min(StockTakeDate)
                      FROM tblStock),GetDate()) D
           FULL OUTER JOIN
                    (SELECT ProductID FROM tblProduct) P ON 1=1

2) The ‘Last’ bit

Now that we have a large fact table consisting of one record per product/date, we can load this into the cube.

If you just add the StockLevel field as a measure and browse the results, you’ll quickly see that if you view it by month, you will get each day’s stock level added together giving you a non-sensical value. To fix this we need to tell Analysis Services to only show one day’s value.

To do this we first need to find all descendents of the current time member at the day level, using something like this:

     DESCENDANTS([Time].[Year Month Day].CurrentMember,
       [Time].[Year Month Day].[Day])
       --Please modify to suit your own date hierarchy! 

We can then find the last member (giving us the closing stock level) by using TAIL():

     TAIL(DESCENDANTS([Time].[Year Month Day].CurrentMember,
          [Time].[Year Month Day].[Day]))

You could aso use HEAD() if you wanted to find the opening stock instead of closing.

You should hide the actual StockLevel measure to prevent users from selecting it, I usually alias these with an underscore, as well as making them invisible, just for clarity. You can then add a calculated member with the following MDX:


     CREATE MEMBER CURRENTCUBE.[MEASURES].[Stock Level Close]
      AS SUM(TAIL(DESCENDANTS([Time].[Year Month Day].currentmember,
                    [Time].[Year Month Day].[Day])),
                    [Measures].[_Stock Level]),
     FORMAT_STRING = "#,#",
     VISIBLE = 1  ;

Or you can calculate the average stock over the selected period


     CREATE MEMBER CURRENTCUBE.[MEASURES].[Stock Level Avg]
      AS AVG(DESCENDANTS([Time].[Year Month Day].currentmember,
                   [Time].[Year Month Day].[Day]),
                   [Measures].[_Stock Level]),
     FORMAT_STRING = "#,#",
     VISIBLE = 1  ;

Or the maximum value


     CREATE MEMBER CURRENTCUBE.[MEASURES].[Stock Level Max]
      AS MAX(DESCENDANTS([Time].[Year Month Day].currentmember,
                   [Time].[Year Month Day].[Day]),
                   [Measures].[_Stock Level]),
     FORMAT_STRING = "#,#",
     VISIBLE = 1  ;

Or the mimimum value


     CREATE MEMBER CURRENTCUBE.[MEASURES].[Stock Level Min]
      AS MIN(DESCENDANTS([Time].[Year Month Day].currentmember,
                  [Time].[Year Month Day].[Day]),
                  [Measures].[_Stock Level]),
     FORMAT_STRING = "#,#",
     VISIBLE = 1  ;

And there you have it, semi additive measures in SQL Server 2005 Standard Edition!

Even though this method does work well, it is still not as good as having the Enterprise edition. The built in functions of Enterprise will perform significantly better than this method, and it saves having to create the large (potentially huge) fact table. This process will also only work on a single date hierarchy. If you have multiple hierarchies (i.e. fiscal and calendar) you will need to enhance this somewhat.

Excel Addin for Analysis Services

For any users of Analysis Services, if you haven’t already downloaded the Excel (2002/2003) addin you’re missing out.

It’s a free download from Microsoft which significantly expands Excel’s cube querying ability. Well recommended!

Get it here…

Power BI Sentinel
The Frog Blog

Team Purple Frog specialise in designing and implementing Microsoft Data Analytics solutions, including Data Warehouses, Cubes, SQL Server, SSIS, ADF, SSAS, Power BI, MDX, DAX, Machine Learning and more.

This is a collection of thoughts, ramblings and ideas that we think would be useful to share.

Authors:

Alex Whittles
(MVP)
Jeet Kainth
Jon Fletcher
Nick Edwards
Joe Billingham
Lewis Prince
Reiss McSporran
Microsoft Gold Partner

Data Platform MVP

Power BI Sentinel
Frog Blog Out
twitter
rssicon