One of the most frustrating limitations of SQL Server 2005 Standard Edition is that it doesn’t support semi-additive measures in Analysis Services (SSAS) cubes. This post explains a workaround that provides similar functionality without having to shell out for Enterprise Edition.
What Are Semi-Additive Measures?
Semi-additive measures are values that you can summarise across any related dimension except time.
For example, sales and costs are fully additive; if you sell 100 yesterday and 50 today then you’ve sold 150 in total. You can add them up over time.
Stock levels, however, are semi-additive; if you had 100 in stock yesterday and 50 in stock today, your total stock is 50, not 150. It doesn’t make sense to add up the measures over time; you need to find the most recent value.
Why are they important?
Whether they are important to you or not depends entirely on what you are trying to do with your cube. If all of your required measures are fully additive then you really don’t need to worry about anything. However, as soon as you want to include measures such as stock levels, salaries, share prices or test results, they become pretty much essential.
Why are they not available in SQL Standard edition?
Microsoft has to have some way of persuading us to pay for the Enterprise edition!
How can I get this functionality within SQL Standard?
Firstly, we need to understand what semi-additive measures do. By far the most common aggregation used is the LastNonEmpty function, so we’ll stick with that as an example. This basically says: whatever time frame you are looking at, find the most recent value for each tuple. It really is a fantastically powerful function, which only becomes apparent when you don’t have it!
Let’s say that you perform a stock take of different products on different days of the week. You will have a stock entry for product A on a Thursday and product B on a Friday. The LastNonEmpty function takes care of this for you: if you look at the stock level on Saturday it will give you the correct values for both A and B, even though you didn’t perform a physical stock take on the Saturday.
If you then add the time dimension into the query, SSAS will perform this function for each and every time attribute shown, and then aggregate the results up to any other dimensions used. In other words, each month will display the sum of all LastNonEmpty values for all products within that month, essentially the closing stock level for each and every month.
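To put some made-up numbers on that: if product A was counted at 100 on Thursday and product B at 50 on Friday, then on Saturday LastNonEmpty reports A = 100 and B = 50 (with no stock take needed on the Saturday), and the month as a whole shows 150, the sum of each product’s most recent level.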
To replicate this in Standard Edition, we need to split the work up into two stages.
1) Create daily values in the data warehouse
2) Use MDX to select a single value from the time dimension.
Think of this as splitting up the LastNonEmpty function into two, ‘Last’ and ‘Non Empty’. The ‘Non Empty’ bit essentially fills in the blanks for us. If a value doesn’t exist for that particular day, it looks at the previous day’s value. The ‘Last’ bit says that if we are looking at months in our query, find the value for the last day in that month. The same goes for years, or indeed any other time attribute.
Coding up a full LastNonEmpty function ourselves in MDX would be too slow to query as soon as the cube reaches any reasonable size. One of the key benefits of a cube is speed of querying data and we don’t want to impact this too much, so we move some of the donkey work into the ETL process populating the data warehouse. This leaves the cube to perform a calculation simple enough not to cause any problems.
1) The ‘Non Empty’ bit
Let’s say that we have a table called tblStock, containing the following data:
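(The exact data doesn’t matter; for illustration, here’s a minimal sketch of the shape tblStock is assumed to take, with the column names taken from the queries below and made-up sample rows.)

-- Illustrative only: one row per physical stock take, not per day
CREATE TABLE tblStock (
    ProductID     int,
    StockTakeDate datetime,
    StockLevel    int
)

INSERT INTO tblStock VALUES (1, '20090101', 100)  -- product 1 counted on 1 Jan
INSERT INTO tblStock VALUES (2, '20090102', 80)   -- product 2 counted on 2 Jan
INSERT INTO tblStock VALUES (1, '20090105', 60)   -- product 1 counted again on 5 Jan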
We need to expand this into a new fact table that contains one record per day per product.
There are a number of ways of doing this. I’ll describe one here that should suit most situations, although you may need to customise it to your own circumstances and limit it to only updating changed or new records rather than re-populating the entire table (a sketch of an incremental variant follows the full query below). I should point out that you would be much better off populating this as part of your ETL process, but I’m showing this method as it’s more generic.
You need a list of all available dates relevant to your data warehouse or cube. If you already have a time dimension table then use this, otherwise create a SQL function that returns you a list of dates, such as this one:
CREATE FUNCTION [dbo].[FN_ReturnAllDates](
    @DateFrom DateTime, @DateTo DateTime)
RETURNS @List TABLE (Date DateTime)
AS
BEGIN
    DECLARE @tmpDate DateTime
    SET @tmpDate = @DateFrom

    -- Loop from @DateFrom to @DateTo, adding one row per day
    WHILE @tmpDate <= @DateTo
    BEGIN
        -- Convert to and from a string (style 102) to strip the time portion
        INSERT INTO @List
        SELECT CONVERT(datetime,
               CONVERT(nvarchar(10), @tmpDate, 102), 102)

        SET @tmpDate = DATEADD(d, 1, @tmpDate)
    END

    RETURN
END
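A quick way to sanity-check the function (the date range here is arbitrary):

SELECT Date FROM dbo.FN_ReturnAllDates('20090101', '20090107')
-- Returns seven rows, 2009-01-01 through 2009-01-07, each at midnight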
We then need to join the list of dates to any other relevant dimensions, in this case product, so that we generate one record per product per date (the full outer join on 1=1 in the query below is effectively a cross join). We can then perform a sub-query for each combination to find the stock level appropriate for that day. (Yes, this will be a slow query to run – I did say you should do it in your ETL process!)
-- The earliest stock take marks the start of the date range;
-- table-valued function parameters can't be sub-queries, so use variables
DECLARE @DateFrom DateTime, @DateTo DateTime
SELECT @DateFrom = MIN(StockTakeDate) FROM tblStock
SET @DateTo = GETDATE()

INSERT INTO FactStock
    (StockTakeDate, ProductID, StockLevel)
SELECT D.Date, P.ProductID,
    -- For each product/date, pick the most recent stock take on or
    -- before that date; default to 0 if there has never been one
    ISNULL((SELECT TOP 1 StockLevel
            FROM tblStock
            WHERE ProductID = P.ProductID
              AND StockTakeDate <= D.Date
            ORDER BY StockTakeDate DESC), 0)
FROM dbo.FN_ReturnAllDates(@DateFrom, @DateTo) D
FULL OUTER JOIN   -- on 1=1 this behaves like a cross join
    (SELECT ProductID FROM tblProduct) P ON 1 = 1
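In a real ETL you would normally only append rows for dates that haven’t been loaded yet, rather than rebuilding the whole table each time. A rough sketch of that incremental variant (it assumes FactStock already holds history up to its most recent StockTakeDate):

-- Only generate rows for dates newer than the last one already in FactStock
DECLARE @LastLoaded DateTime, @ToDate DateTime
SELECT @LastLoaded = MAX(StockTakeDate) FROM FactStock
SET @ToDate = GETDATE()

INSERT INTO FactStock
    (StockTakeDate, ProductID, StockLevel)
SELECT D.Date, P.ProductID,
    ISNULL((SELECT TOP 1 StockLevel
            FROM tblStock
            WHERE ProductID = P.ProductID
              AND StockTakeDate <= D.Date
            ORDER BY StockTakeDate DESC), 0)
FROM dbo.FN_ReturnAllDates(@LastLoaded, @ToDate) D
FULL OUTER JOIN
    (SELECT ProductID FROM tblProduct) P ON 1 = 1
WHERE D.Date > @LastLoaded   -- skip the date that's already been loaded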
2) The ‘Last’ bit
Now that we have a large fact table consisting of one record per product/date, we can load this into the cube.
If you just add the StockLevel field as a measure and browse the results, you’ll quickly see that if you view it by month, you get each day’s stock level added together, giving you a nonsensical value. To fix this we need to tell Analysis Services to only show one day’s value.
To do this we first need to find all descendants of the current time member at the day level, using something like this:
DESCENDANTS([Time].[Year Month Day].CurrentMember,
[Time].[Year Month Day].[Day])
--Please modify to suit your own date hierarchy!
We can then find the last member (giving us the closing stock level) by using TAIL():
TAIL(DESCENDANTS([Time].[Year Month Day].CurrentMember,
[Time].[Year Month Day].[Day]))
You could also use HEAD() if you wanted to find the opening stock instead of the closing stock (an example follows the other calculated members below).
You should hide the actual StockLevel measure to prevent users from selecting it; I usually prefix these with an underscore, as well as making them invisible, just for clarity. You can then add a calculated member with the following MDX:
CREATE MEMBER CURRENTCUBE.[MEASURES].[Stock Level Close]
AS SUM(TAIL(DESCENDANTS([Time].[Year Month Day].currentmember,
[Time].[Year Month Day].[Day])),
[Measures].[_Stock Level]),
FORMAT_STRING = "#,#",
VISIBLE = 1 ;
Or you can calculate the average stock over the selected period
CREATE MEMBER CURRENTCUBE.[MEASURES].[Stock Level Avg]
AS AVG(DESCENDANTS([Time].[Year Month Day].currentmember,
[Time].[Year Month Day].[Day]),
[Measures].[_Stock Level]),
FORMAT_STRING = "#,#",
VISIBLE = 1 ;
Or the maximum value
CREATE MEMBER CURRENTCUBE.[MEASURES].[Stock Level Max]
AS MAX(DESCENDANTS([Time].[Year Month Day].currentmember,
[Time].[Year Month Day].[Day]),
[Measures].[_Stock Level]),
FORMAT_STRING = "#,#",
VISIBLE = 1 ;
Or the minimum value
CREATE MEMBER CURRENTCUBE.[MEASURES].[Stock Level Min]
AS MIN(DESCENDANTS([Time].[Year Month Day].currentmember,
[Time].[Year Month Day].[Day]),
[Measures].[_Stock Level]),
FORMAT_STRING = "#,#",
VISIBLE = 1 ;
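Or, as mentioned above, the opening value rather than the closing one, simply by swapping TAIL() for HEAD() (the member name here is just for illustration; it follows the same pattern as the closing member):

CREATE MEMBER CURRENTCUBE.[MEASURES].[Stock Level Open]
AS SUM(HEAD(DESCENDANTS([Time].[Year Month Day].currentmember,
[Time].[Year Month Day].[Day])),
[Measures].[_Stock Level]),
FORMAT_STRING = "#,#",
VISIBLE = 1 ;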
And there you have it: semi-additive measures in SQL Server 2005 Standard Edition!
Even though this method does work well, it is still not as good as having Enterprise Edition. The built-in functions of Enterprise will perform significantly better than this method, and they save having to create the large (potentially huge) fact table. This process will also only work with a single date hierarchy. If you have multiple hierarchies (e.g. fiscal and calendar) you will need to enhance it somewhat.
Fantastic post!!!
Cheers!
Pedro
I’ve got 16,133 different products and 48 locations, so the combinations for one day are 774,384 rows (one product per location); one year is then 282,650,160 rows. It takes a very long time to load the data, even with ETL. Any ideas?
Hi Daniel
The first thing I’d do is assess whether you really need daily data for this, or whether weekly or monthly data would be sufficient.
I would recommend keeping daily data for 1 week, weekly data for the previous 7 weeks, and then monthly data for all previous history.
All MDX calculations listed in the post should still work without modification; they will just find the most recent value stored for any given day. The MDX will figure out whether it has to go back 1 day or 1 month.
You can take care of this entirely in the ETL. Firstly, delete any old records that aren’t needed (either using a lookup table of dates to keep, or an appropriate algorithm based on the date), then load the new records for the latest date.
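A rough T-SQL sketch of that thinning step (the retention windows and the week-end/month-end rules here are purely illustrative; FactStock is the table from the main post):

-- 1) Daily rows older than a week but within the weekly window: keep one row per week
DELETE FROM FactStock
WHERE StockTakeDate < DATEADD(d, -7, GETDATE())
  AND StockTakeDate >= DATEADD(wk, -8, GETDATE())
  AND DATEPART(dw, StockTakeDate) <> 7   -- keep only Saturdays (assumes the default DATEFIRST)

-- 2) Anything older than the weekly window: keep only the last day of each month
DELETE FROM FactStock
WHERE StockTakeDate < DATEADD(wk, -8, GETDATE())
  AND StockTakeDate <> DATEADD(d, -1, DATEADD(m, DATEDIFF(m, 0, StockTakeDate) + 1, 0))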
You’re not going to get around the problem of loading 774,384 records every night, but if the data flow is designed correctly this really shouldn’t cause any performance problems at all; SSIS is designed to cope with millions of records.
Hope this helps, please let me know if you need any more info.
Alex
Where can I find Microsoft documentation that says semi-additive measures are not supported in SQL Server 2005 Standard edition?
I checked the SQL Server 2005 Features Comparison page below and did not see it.
http://www.microsoft.com/Sqlserver/2005/en/us/compare-features.aspx#busintell
Thanks!
My bad! Upon closer inspection, it is listed there! Thanks for your post!
As far as I’ve been testing, the built-in functions of Enterprise also need to create the large (potentially huge) fact tables.
E.g. the Adventure Works FactCurrencyRate table.
Can anyone explain to me the real benefits of the semi-additive functions in Enterprise Edition?
Does Microsoft really know anything about the real world?
Thanks, this is a great example.
Would you just use an OLE DB Command to run/build the date dimension and a Merge Join transform to do this in the data flow for ETL? Is there a better transform to use to build the date list and/or do the outer join? I can’t see a way to use a DB function within a join. Is that what I should be trying to do?
I’m just trying to teach myself this but all the other examples I find seem to assume I have bought Enterprise.
Is it possible to use something like an SCD to reduce the number of fact rows?
Thanks!
Thanks for the good post.
I have used the method and have a question:
We have a cube where daily values are loaded. There is also a time hierarchy Year > Month > Day. Everything works fine except for the case when you, for example, choose 2009 (let’s say the current date is 2009-06-08); the return is then empty, since the whole of 2009 isn’t populated. How would one get the last value, e.g. 2009-06-08, in that case? (I guess filling in the whole year would be one solution.)
Hi Reza
You can get the last member under any year in your hierarchy by chaining the LastChild function. The LastChild of a year gives its last month, and the LastChild of that gives the last day, which will either be 31 December (for previous years) or the most recent date in the dimension (for the current year). i.e.
[Time].[Year].&[2009].LastChild.LastChild
This will find the most recent day from the most recent month of the 2009 year in your Time hierarchy.
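For example, wrapped in a query (the cube name here is hypothetical, and the member path should be adjusted to match your own Time hierarchy):

SELECT {[Measures].[Stock Level Close]} ON COLUMNS,
{[Time].[Year Month Day].[Year].&[2009].LastChild.LastChild} ON ROWS
FROM [Stock]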
Hope this helps
Alex
Very, very useful post. Thanks!