The Cloud is pretty much top of the corporate buzzword bingo list at the moment. It's clear what the benefits are from a SaaS perspective, but it's been less clear how the Business Intelligence world will start to make use of it.
Data Warehousing, Management Information, Cubes, Data Mining etc. all involve a lot of data, usually gigabytes or terabytes even for medium-sized organisations. Internet connectivity (speed and reliability) is obviously going to be a major concern and limiting factor; however, we can assume that connectivity will continue to improve dramatically over the next few years.
Security is a major concern; when you combine all of your company's information into a single place (accounts, sales, customers, etc.) it raises a large question of how the security is managed when you let this information out of your firewalled LAN and host it on the internet.
It will of course bring a number of benefits, including reduced infrastructure/capital costs, improved reliability and redundancy of hardware, and increased accessibility for the mobile workforce. These should not be overlooked, as they will no doubt provide a large incentive for some companies.
I've no doubt that the cloud is coming to BI, but I'd think carefully about what actual benefits it brings and how they trade off against any problems it may also bring.
There's an interesting webcast panel discussion about BI and Cloud Computing below, including (among others) Donald Farmer from Microsoft discussing some of the issues above.
This is the 3rd and final post in this series of blog posts, showing how you can use SQL Server Reporting Services (SSRS), DMVs and spatial data to generate real time automated user guide documentation for your Analysis Services (SSAS) OLAP cube.
- Part 1 – Creating the DMV stored procs
- Part 2 – Create the SSRS reports
- Part 3 – Use spatial data and maps to create a star schema view
- Download Source Code
UPDATE: I presented a 1 hour session at SQLBits 8 covering this work, you can watch the video here.
In this post, I’m going to enhance the measure group report to include a visualisation of the star schema. To do this I’ll be enhancing one of the stored procedures to utilise SQL Server 2008’s new spatial data types combined with SSRS 2008 R2’s new map functionality.

To do this we’ll update the dbo.upCubeDocDimensionsForMeasureGroup stored proc so that it returns a SQL geometry polygon for each row, in the right place around the circumference of the star. There’s a little math in this, but nothing more than a bit of trigonometry.
First the theory. We have an arbitrary number of dimensions that we need to place in a circle around a central point (the measure group). If we have 6 dimensions, then we need to divide the whole circle (360 degrees) by 6 (=60 degrees each) to get the angle of each around the hypothetical axis.

Therefore the first dimension needs to be at 60, the second at 120, the third at 180 etc, with the 6th at 360, completing the full circle.
Obviously the angle needs to vary depending on the number of dimensions in the query, so we need to calculate it within the stored proc. To do this I’m using common table expressions (CTE) to perform further calculations on the basic query.
We wrap the original proc query into a CTE and call it BaseData. We also add an extra field called Seq, which uniquely identifies each row. We’ll use this later to rank the dimensions.
;WITH BaseData AS
(
SELECT
mgd.*
, d.[DESCRIPTION]
, REPLACE(REPLACE(CAST(mgd.[DIMENSION_UNIQUE_NAME]
AS VARCHAR(255))
,'[',''),']','') AS DimensionCaption
, REPLACE(REPLACE(CAST(mgd.[MEASUREGROUP_NAME]
AS VARCHAR(255))
,'[',''),']','') AS MeasureGroupCaption
FROM OPENQUERY(CubeLinkedServer, 'SELECT
[CATALOG_NAME]
+[CUBE_NAME]
+[MEASUREGROUP_NAME]
+[DIMENSION_UNIQUE_NAME] AS Seq
, [CATALOG_NAME]
, [CUBE_NAME]
, [MEASUREGROUP_NAME]
, [MEASUREGROUP_CARDINALITY]
, [DIMENSION_UNIQUE_NAME]
, [DIMENSION_CARDINALITY]
, [DIMENSION_IS_VISIBLE]
, [DIMENSION_IS_FACT_DIMENSION]
, [DIMENSION_GRANULARITY]
FROM $SYSTEM.MDSCHEMA_MEASUREGROUP_DIMENSIONS
WHERE [DIMENSION_IS_VISIBLE]') mgd
INNER JOIN OPENQUERY(CubeLinkedServer, 'SELECT
[CATALOG_NAME]
,[CUBE_NAME]
,[DIMENSION_UNIQUE_NAME]
,[DESCRIPTION]
FROM $SYSTEM.MDSCHEMA_DIMENSIONS
WHERE [DIMENSION_IS_VISIBLE]') d
ON CAST(mgd.[CATALOG_NAME] AS VARCHAR(255))
= CAST(d.[CATALOG_NAME] AS VARCHAR(255))
AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255))
= CAST(d.[CUBE_NAME] AS VARCHAR(255))
AND CAST(mgd.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
= CAST(d.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
WHERE CAST(mgd.[CATALOG_NAME] AS VARCHAR(255)) = @Catalog
AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255)) = @Cube
AND CAST(mgd.[MEASUREGROUP_NAME] AS VARCHAR(255)) = @MeasureGroup
)
We’ll then add a new CTE which calculates the number of records returned by the previous query.
,TotCount AS
(
SELECT COUNT(*) AS RecCount FROM BaseData
)
Next we cross join TotCount with the base data, so that every row has the extra RecCount field. We then rank each record, providing each with a unique number from 1 to n.
, RecCount AS
(
SELECT RANK() OVER (ORDER BY CAST(Seq AS VARCHAR(255))) AS RecID
, RecCount
, BaseData.*
FROM
BaseData CROSS JOIN TotCount
)
Each record now contains its row number as well as the total number of rows, so it’s easy to calculate its position around the circle (RecID / RecCount * 360). With the angle known, calculating the x and y coordinates of each dimension is simply a case of applying sine and cosine. Note that the SQL SIN and COS functions expect angles in radians, not degrees, so we have to use the RADIANS function to convert. I’m also multiplying the result by 1000, which scales the coordinates from a range of -1 to +1 up to a range of -1000 to +1000 and makes our life easier later on.
, Angles AS
(
SELECT
*
, SIN(RADIANS((CAST(RecID AS FLOAT)
/CAST(RecCount AS FLOAT))
* 360)) * 1000 AS x
, COS(RADIANS((CAST(RecID AS FLOAT)
/CAST(RecCount AS FLOAT))
* 360)) * 1000 AS y
FROM RecCount
)
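If you want to see the numbers this produces without connecting to a cube, the same chain of CTEs can be run against a hard-coded list. This is only a sketch: the six names in the VALUES list are placeholders standing in for the rows that BaseData would return, and I've rounded the coordinates purely to make them easier to read.

```sql
-- Placeholder rows standing in for the output of the real BaseData CTE.
;WITH BaseData AS
(
    SELECT Seq
    FROM (VALUES ('Dimension A'), ('Dimension B'), ('Dimension C'),
                 ('Dimension D'), ('Dimension E'), ('Dimension F')) AS v(Seq)
)
, TotCount AS
(
    -- Total number of dimensions to place around the circle
    SELECT COUNT(*) AS RecCount FROM BaseData
)
, RecCount AS
(
    -- Number each row from 1 to n and attach the total to every row
    SELECT RANK() OVER (ORDER BY Seq) AS RecID
         , RecCount
         , BaseData.Seq
    FROM BaseData CROSS JOIN TotCount
)
SELECT Seq
     , RecID
     , (CAST(RecID AS FLOAT) / CAST(RecCount AS FLOAT)) * 360 AS AngleDegrees
     , ROUND(SIN(RADIANS((CAST(RecID AS FLOAT)
            / CAST(RecCount AS FLOAT)) * 360)) * 1000, 0) AS x
     , ROUND(COS(RADIANS((CAST(RecID AS FLOAT)
            / CAST(RecCount AS FLOAT)) * 360)) * 1000, 0) AS y
FROM RecCount
ORDER BY RecID;
```

With six rows the angles step up in 60 degree increments, so the first row should come out at roughly x = 866, y = 500 and the last row (360 degrees) at x = 0, y = 1000.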
We can now use the x and y coordinates to create a point indicating the position of each dimension, using the code below.
geometry::STGeomFromText('POINT('
+ CAST(y AS VARCHAR(20))
+ ' '
+ CAST(x AS VARCHAR(20))
+ ')',4326) AS Posn
This is a good start, but we want a polygon box, not a single point. We can use a similar geometry function to create a polygon around our point.
geometry::STPolyFromText('POLYGON ((' +
CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
+ CAST(x+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
CAST((y*@Stretch)-@BoxSize AS VARCHAR(20)) + ' '
+ CAST(x+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
CAST((y*@Stretch)-@BoxSize AS VARCHAR(20)) + ' '
+ CAST(x-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
+ CAST(x-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
+ CAST(x+(@BoxSize/2) AS VARCHAR(20)) + '
))',0) AS Box
You’ll notice that I’m multiplying the y axis by a @Stretch variable. This allows us to squash or stretch the resulting star to make it look better in the report. I’m also using a @BoxSize variable, which we can use to change the relative size of the boxes. This is why I like to work on a -1000 to +1000 scale: it means we can have an integer box size of, say, 250 instead of a fraction such as 0.25, which I think is easier to read.
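To check the box logic in isolation, you can build a single polygon from a fixed centre point and ask SQL Server whether it is valid. This is a minimal sketch; the @x and @y values are made up for illustration (roughly the 60 degree position), and in the proc they come from the Angles CTE instead.

```sql
DECLARE @BoxSize INT = 250;     -- same values as used in the proc
DECLARE @Stretch FLOAT = 1.4;
DECLARE @x FLOAT = 866;         -- illustrative centre point only
DECLARE @y FLOAT = 500;

-- Five points: four corners, then the first corner again to close the ring
DECLARE @Box geometry = geometry::STPolyFromText('POLYGON ((' +
    CAST((@y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
        + CAST(@x+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
    CAST((@y*@Stretch)-@BoxSize AS VARCHAR(20)) + ' '
        + CAST(@x+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
    CAST((@y*@Stretch)-@BoxSize AS VARCHAR(20)) + ' '
        + CAST(@x-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
    CAST((@y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
        + CAST(@x-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
    CAST((@y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
        + CAST(@x+(@BoxSize/2) AS VARCHAR(20)) +
    '))', 0);

SELECT @Box.STIsValid() AS IsValid   -- 1 means the ring is well formed
     , @Box.STArea()    AS Area      -- (2 * @BoxSize) by (@BoxSize / 2 * 2) units
     , @Box.STAsText()  AS Wkt;
```

Because the first coordinate spans @BoxSize either side of the centre and the second only half of that, each box is twice as long in one direction as the other, which leaves room for the caption.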
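So you’ll now have a stored proc similar to this. (If the Part 1 version of the proc already exists in your database, change CREATE to ALTER, or drop it first.)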
CREATE PROCEDURE [dbo].[upCubeDocDimensionsForMeasureGroup]
(@Catalog VARCHAR(255)
,@Cube VARCHAR(255)
,@MeasureGroup VARCHAR(255)
)
AS
DECLARE @BoxSize INT
DECLARE @Stretch FLOAT
SET @BoxSize = 250
SET @Stretch = 1.4
;WITH BaseData AS
(
SELECT
mgd.*
, d.[DESCRIPTION]
, REPLACE(REPLACE(CAST(mgd.[DIMENSION_UNIQUE_NAME]
AS VARCHAR(255))
,'[',''),']','') AS DimensionCaption
, REPLACE(REPLACE(CAST(mgd.[MEASUREGROUP_NAME]
AS VARCHAR(255))
,'[',''),']','') AS MeasureGroupCaption
FROM OPENQUERY(CubeLinkedServer, 'SELECT
[CATALOG_NAME]
+[CUBE_NAME]
+[MEASUREGROUP_NAME]
+[DIMENSION_UNIQUE_NAME] AS Seq
, [CATALOG_NAME]
, [CUBE_NAME]
, [MEASUREGROUP_NAME]
, [MEASUREGROUP_CARDINALITY]
, [DIMENSION_UNIQUE_NAME]
, [DIMENSION_CARDINALITY]
, [DIMENSION_IS_VISIBLE]
, [DIMENSION_IS_FACT_DIMENSION]
, [DIMENSION_GRANULARITY]
FROM $SYSTEM.MDSCHEMA_MEASUREGROUP_DIMENSIONS
WHERE [DIMENSION_IS_VISIBLE]') mgd
INNER JOIN OPENQUERY(CubeLinkedServer, 'SELECT
[CATALOG_NAME]
,[CUBE_NAME]
,[DIMENSION_UNIQUE_NAME]
,[DESCRIPTION]
FROM $SYSTEM.MDSCHEMA_DIMENSIONS
WHERE [DIMENSION_IS_VISIBLE]') d
ON CAST(mgd.[CATALOG_NAME] AS VARCHAR(255))
= CAST(d.[CATALOG_NAME] AS VARCHAR(255))
AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255))
= CAST(d.[CUBE_NAME] AS VARCHAR(255))
AND CAST(mgd.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
= CAST(d.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
WHERE CAST(mgd.[CATALOG_NAME] AS VARCHAR(255)) = @Catalog
AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255)) = @Cube
AND CAST(mgd.[MEASUREGROUP_NAME] AS VARCHAR(255)) = @MeasureGroup
)
,TotCount AS
(
SELECT COUNT(*) AS RecCount FROM BaseData
)
, RecCount AS
(
SELECT RANK() OVER (ORDER BY CAST(Seq AS VARCHAR(255))) AS RecID
, RecCount
, BaseData.*
FROM
BaseData CROSS JOIN TotCount
)
, Angles AS
(
SELECT
*
, SIN(RADIANS((CAST(RecID AS FLOAT)
/CAST(RecCount AS FLOAT))
* 360)) * 1000 AS x
, COS(RADIANS((CAST(RecID AS FLOAT)
/CAST(RecCount AS FLOAT))
* 360)) * 1000 AS y
FROM RecCount
)
,Results AS
(
SELECT
*
, geometry::STGeomFromText('POINT('
+ CAST(y AS VARCHAR(20))
+ ' '
+ CAST(x AS VARCHAR(20))
+ ')',4326) AS Posn
, geometry::STPolyFromText('POLYGON ((' +
CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
+ CAST(x+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
CAST((y*@Stretch)-@BoxSize AS VARCHAR(20)) + ' '
+ CAST(x+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
CAST((y*@Stretch)-@BoxSize AS VARCHAR(20)) + ' '
+ CAST(x-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
+ CAST(x-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
+ CAST(x+(@BoxSize/2) AS VARCHAR(20)) + '
))',0) AS Box
FROM Angles
)
SELECT * FROM Results
GO
If you then execute this in Management Studio, you’ll notice an extra tab in the result window called Spatial Results.
EXEC [dbo].[upCubeDocDimensionsForMeasureGroup]
@Catalog = 'Adventure Works DW 2008R2',
@Cube = 'Adventure Works',
@MeasureGroup = 'Financial Reporting'
Click on the Spatial Results tab, then select Box as the spatial column, and you’ll see the boxes that we’ve created in a preview window.

This is now getting somewhere close. But as well as the dimensions, we also want to show the measure group in the middle, as well as lines linking them together to actually create our star. We can do this by adding a couple more geometry functions to our query. We end up with the end of our proc looking like this.
,Results AS
(
SELECT
*
, geometry::STGeomFromText('POINT('
+ CAST(y AS VARCHAR(20))
+ ' '
+ CAST(x AS VARCHAR(20))
+ ')',4326) AS Posn
, geometry::STPolyFromText('POLYGON ((' +
CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
+ CAST(x+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
CAST((y*@Stretch)-@BoxSize AS VARCHAR(20)) + ' '
+ CAST(x+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
CAST((y*@Stretch)-@BoxSize AS VARCHAR(20)) + ' '
+ CAST(x-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
+ CAST(x-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
+ CAST(x+(@BoxSize/2) AS VARCHAR(20)) + '
))',0) AS Box
, geometry::STLineFromText('LINESTRING (0 0, '
+ CAST((y*@Stretch) AS VARCHAR(20))
+ ' ' + CAST(x AS VARCHAR(20))
+ ')', 0) AS Line
, geometry::STPolyFromText('POLYGON ((' +
CAST(0+@BoxSize AS VARCHAR(20)) + ' '
+ CAST(0+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
CAST(0-@BoxSize AS VARCHAR(20)) + ' '
+ CAST(0+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
CAST(0-@BoxSize AS VARCHAR(20)) + ' '
+ CAST(0-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
CAST(0+@BoxSize AS VARCHAR(20)) + ' '
+ CAST(0-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
CAST(0+@BoxSize AS VARCHAR(20)) + ' '
+ CAST(0+(@BoxSize/2) AS VARCHAR(20)) + '
))',0) AS CenterBox
FROM Angles
)
SELECT * FROM Results
GO
So, we’ve now got the polygons and lines being generated by the proc, it’s now time to add them into the report and display them to our users.
Firstly open up the CubeDoc_MeasureGroup.rdl report, and go to the properties of the dsDimensions dataset. Click Refresh Fields and the new x, y, Posn, Box, Line and CenterBox fields should now be available.
Then drag a Map from the toolbox onto the report. This will start the map wizard.
Select SQL Server Spatial Query as the source for the data and click Next.
Choose dsDimensions as the dataset to use for the data and click Next.
Choose Box as the spatial field, and Polygon as the layer type. It may well give you an error, just ignore it.
Don’t select the embed map or Bing maps layer.

Click Next, then select a basic map, select the default options for the remaining wizard stages and you’ll end up with a map in your report.
If you preview the report at this stage you won’t see the polygons. This is because the map still thinks it’s a geographical map, and it is trying to draw our boxes as latitudes and longitudes. We don’t want this; we want it to show them on our own scale. To fix this, just change the CoordinateSystem property of the map from Geographic to Planar.

You can then preview the report, which should show you something like this

It still doesn’t look like a star but we’ve still got a few more changes to make. We need to add a couple more layers to the map, the center box for the measure group and then the lines to link them all together.
Add a new polygon layer to the map and then set the layer data to use the CenterBox field from the dsDimensions dataset.


Repeat the above, but with a new line layer instead of a polygon layer. Set the layer data to use the Line field of dsDimensions.

To move the lines behind the boxes, just rearrange the order of the layers by using the blue arrows in the map layers window. We want the Lines layer to be at the bottom.

Set the PolygonTemplate.BackgroundColor property of the CenterBox layer to Khaki, and then set the same property of the dimension box layer to LightBlue.

Then set the PolygonTemplate.Label property of the CenterBox layer to the MeasureGroupCaption field, and set the ShowLabel property to True. If you don’t, SSRS will decide whether or not to show the label, and we want it to always show.
Set the PolygonTemplate.Label property of the Dimension layer to the DimensionCaption field, and set the ShowLabel property to True.
You can then play around with font sizes, zoom, background colours, line widths etc. to get the effects that you want, but you’ll end up with a star schema visualisation similar to this.

You can also configure actions for each layer. Using this you can hyperlink each dimension box to show the CubeDoc_Dimension report for the selected dimension etc, making the star schema interactive.
This has been quite a fun blog post to investigate, I hope you can take something useful from it and have as much fun with it as I’ve had with it. Every demo that I’ve seen using spatial data has been using maps, hopefully this shows an alternative use beyond geographical mapping.
In my previous post I described how to create a number of stored procedures that use Dynamic Management Views (DMVs) to return the metadata structure of an SSAS 2008 OLAP cube, including dimensions, attributes, measure groups, BUS matrix etc.
In this post I’m going to use those procs to create a set of SSRS 2008 reports that will serve as the automated documentation of your cube. I’m going to make the following assumptions:
- You’ve read the part 1 post, and already have the stored procs in a database.
- You know the basics of SSRS 2008
If you haven’t read part 1, you can jump to it here.
- Part 1 – Creating the DMV stored procs
- Part 2 – Create the SSRS reports
- Part 3 – Use spatial data and maps to create a star schema view
- Download Source Code
UPDATE: I presented a 1 hour session at SQLBits 8 covering this work, you can watch the video here.
Firstly I create a basic template report which has an appropriate header, with date/time stamp, server name and logos etc. This means that all of the reports will have a common look and feel. I could of course make use of the new report parts in SSRS 2008 R2 for this, but to maintain compatibility with pre R2 I’ll keep it simple.

The expression in the box on the top right is just to display today’s date in UK format dd/mm/yyyy.
=FORMAT(Today(),"dd/MM/yyyy")
The reports that we’ll build will include the following:
- CubeDoc_Cubes.rdl – Entry page, list of all cubes in the database
- CubeDoc_Cube.rdl – showing all measure groups and dimensions within the selected cube
- CubeDoc_Dimension.rdl – showing all hierarchies, attributes and related measure groups
- CubeDoc_MeasureGroup.rdl – showing all measures and related dimensions
- CubeDoc_Search.rdl – search names and descriptions of cubes, dimensions, measure groups, etc
Create the first report (CubeDoc_Cubes.rdl) which will act as the entry screen and menu of cubes.
Add a dataset dsCubes, and point it at the stored proc dbo.upCubeDocCubes

The proc has a @Catalog parameter, which filters the result set to a specific SSAS database (catalog). We want to return all cubes from all catalogs, so set the parameter value to =Nothing

All we have to do now is add a table and pull in the dataset fields that we want to see.

We can then preview the report to test that it returns the right list of cubes. You should see something like this.

Note that the AdventureWorks database doesn’t contain any descriptions, so you won’t see any in the report but they will be there when you add descriptions to your own cubes.
The next report we’re going to write is the CubeDoc_Cube report, which will list the measure groups, dimensions and BUS matrix of a single cube. We’ll link the two reports together later on.
Create a new report, using the template report you created earlier (select the template report in the solution explorer window, then CTRL+C then CTRL+V) and rename the new file as CubeDoc_Cube.rdl.
Add a report parameter called @Catalog which should be Text. I’ve set mine to default to “Adventure Works DW 2008R2” to make testing easier.
Add a dataset called dsCubes, and point it at the dbo.upCubeDocCubes proc, and link the @Catalog dataset parameter to the @Catalog report parameter.

This dataset will query the available cubes for the selected catalog, and populate a new parameter which we’ll now create, called @Cube. This should also be a text parameter, but this time we’ll set the available values to those returned by the dsCubes dataset.

If you want you can also set the default value of the parameter to the CUBE_NAME field of dsCubes. This parameter is not a multi value parameter, so by defaulting it to the dataset it will just default to the first record.
We can now use @Catalog and @Cube parameters to query the available measure groups and dimensions.
So, create three new datasets:
- dsDimensions – pointing to dbo.upCubeDocDimensionsInCube
- dsMeasureGroups – pointing to dbo.upCubeDocMeasureGroupsInCube
- dsBusMatrix – pointing to dbo.upCubeDocBUSMatrix
Set each of their @Catalog parameters to the report’s @Catalog parameter, and their @Cube parameters to the report’s @Cube parameter.
Create two tables in the report, one for measure groups and one for dimensions. Drag in the fields that you want to see, and preview the report. You should see something like this.

I’ve added a couple of textbox titles for good measure.
The third dataset, dsBusMatrix, requires something a little more interesting. For those who aren’t familiar with Kimball’s BUS Matrix concept, it’s a grid that shows the relationship and connectivity between facts (measure groups) and their dimensions. As the name suggests, we’ll use SSRS’s Matrix control for this. Once you’ve added a matrix control onto the report, follow these steps (using the dsBusMatrix dataset):
- Drag DIMENSION_UNIQUE_NAME onto the Rows of the matrix
- Drag MEASUREGROUP_NAME onto the columns of the matrix
- Right click on the Data box of the matrix, and type in the expression below. This checks the cardinality of the dimensions/measures to determine whether it is a regular relationship, a many to many or a degenerate fact
=SWITCH(SUM(Fields!Relationship.Value)=0,"", Fields!DIMENSION_IS_FACT_DIMENSION.Value=1,"F", Fields!MEASUREGROUP_CARDINALITY.Value="MANY","M", True,"X")
This will either show an X if there is a regular relationship, or show an M or F if there’s a many to many or degenerate relationship respectively.
To make it easier to read, I also like to set the background colour of the Data textbox to highlight the type of relationship further.
=SWITCH(SUM(Fields!Relationship.Value)=0,"Transparent", Fields!DIMENSION_IS_FACT_DIMENSION.Value=1,"Yellow", Fields!MEASUREGROUP_CARDINALITY.Value="MANY","Red", True,"CornflowerBlue")
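SWITCH returns the value paired with the first condition that is true, so the order of the tests matters: a fact dimension is flagged as F before its cardinality is even considered. If you'd like to check that precedence outside of the report, the same logic can be sketched in T-SQL with a CASE expression. The rows and column names below are invented purely for illustration and are not real cube metadata.

```sql
-- Made-up rows: (dimension, relationship flag, is fact dimension, measure group cardinality)
SELECT v.DimensionName
     , CASE
           WHEN v.Relationship = 0                 THEN ''    -- no relationship
           WHEN v.IsFactDimension = 1              THEN 'F'   -- degenerate (fact) dimension
           WHEN v.MeasureGroupCardinality = 'MANY' THEN 'M'   -- many to many
           ELSE 'X'                                           -- regular relationship
       END AS Marker
FROM (VALUES ('Regular Dim',      1, 0, 'ONE')
           , ('Fact Dim',         1, 1, 'ONE')
           , ('Many To Many Dim', 1, 0, 'MANY')
           , ('Unrelated Dim',    0, 0, 'ONE')
     ) AS v(DimensionName, Relationship, IsFactDimension, MeasureGroupCardinality);
```

The four rows should come back marked X, F, M and blank respectively, matching what the matrix cells show.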
If you preview the report you should see the following

It shows the concept, but it needs a little tidying up. Centering the text in the data textbox helps, but we can also use a fantastic new feature in SSRS 2008 R2 to rotate the column titles. Simply set the WritingMode property in the Localization group to Rotate270, and then shrink the width of the column.

I’ve also added a title, with a key, and level of row grouping using the DIMENSION_MASTER_NAME field, which groups role playing dimensions by their master dimension. It should now look something like this.

That’s it for this report, so save it, then go back to the first report (CubeDoc_Cubes.rdl) and right click, properties on the [CUBE_NAME] textbox. Go to the action tab, and set the action to navigate to the CubeDoc_Cube report, passing through the CATALOG_NAME and CUBE_NAME fields from the dataset as the parameter values. This sets up a hyperlink from one report to the other, allowing users to navigate around the cube doc reports by clicking on what they want to know about.

We then need to do the same for 3 other reports:
CubeDoc_Dimension
- dbo.upCubeDocAttributesInDimension results in a table
- dbo.upCubeDocMeasureGroupsForDimension results in a table
- Add an extra @Dimension parameter, populated from dbo.upCubeDocDimensionsInCube
CubeDoc_MeasureGroup
- dbo.upCubeDocMeasuresInMeasureGroup results in a table
- dbo.upCubeDocDimensionsForMeasureGroup results in a table
- Add an extra @MeasureGroup parameter, populated from dbo.upCubeDocMeasureGroupsInCube
CubeDoc_Search
- Add an extra @Search parameter, text, with no default
- Table containing results from dbo.upCubeDocSearch, using the @Search parameter
Link all appropriate textboxes (measure groups, dimensions, search etc.) to their relevant report using the report action, and hey presto – a fully automated, real time, self-documenting cube report.
In the next and final installment of this series of blog posts, we’ll explore SQL 2008’s spatial data to generate an automated star schema visualisation to add that little something extra to the reports.
Being a business intelligence consultant, I like to spend my time designing data warehouses, ETL scripts and OLAP cubes. An unfortunate consequence of this is having to write the documentation that goes with the fun techy work. So it got me thinking: is there a slightly more fun techy way of automating the documentation of OLAP cubes?
There are some good tools out there such as BI Documenter, but I wanted a way of having more control over the output, and also automating it further so that you don’t have to run an overnight build of the documentation.
I found a great article by Vincent Rainardi describing some DMVs (Dynamic Management Views) available in SQL 2008, which got me thinking: why not just build a number of SSRS reports calling these DMVs, which would then dynamically create the cube structure documentation in real time whenever the report rendered?
This post is the first in a 3 part set which will demonstrate how you can use these DMVs to automate the SSAS cube documentation and user guide.
- Part 1 – Creating the DMV stored procs
- Part 2 – Create the SSRS reports
- Part 3 – Use spatial data and maps to create a star schema view
- Download Source Code
UPDATE: I presented a 1 hour session at SQLBits 8 covering all of this work, you can watch the video here.
There’s a full list of DMVs available in SQL 2008 R2 on the msdn site.
The primary DMVs that are of interest are:
| DMV | Description |
| MDSCHEMA_CUBES | Lists the cubes in an SSAS database |
| MDSCHEMA_MEASUREGROUPS | Lists measure groups |
| MDSCHEMA_DIMENSIONS | Lists dimensions |
| MDSCHEMA_LEVELS | Dimension attributes |
| MDSCHEMA_MEASUREGROUP_DIMENSIONS | Enumerates dimensions of measure groups |
| MDSCHEMA_MEASURES | Lists measures |
When querying DMVs we can use SQL style SELECT statements, but executed against the cube in a DMX window.
SELECT * FROM $SYSTEM.MDSCHEMA_CUBES
This returns a dataset like any other SQL query.

We can even enhance it with DISTINCT and WHERE clauses, although they are more restricted than basic SQL. One of the main limitations is the lack of a JOIN operator. A number of the queries that I’ll perform below need to use JOIN, so to get around this I wrap up each query in a SQL OPENQUERY command, executed against a SQL database with a linked server to the cube. This enables me to perform JOINs using queries such as
SELECT *
FROM OPENQUERY(CubeLinkedServer,
'SELECT *
FROM $SYSTEM.MDSCHEMA_MEASUREGROUP_DIMENSIONS') mgd
INNER JOIN OPENQUERY(CubeLinkedServer,
'SELECT *
FROM $SYSTEM.MDSCHEMA_MEASUREGROUPS') mg
ON mgd.XXX = mg.XXX
etc.
I’m therefore going to create a number of stored procs to wrap up this functionality, the SSRS reports can then just call the procs.
Within BIDS, every item (cube, measure group, measure, dimension, attribute, hierarchy, KPI, etc.) has a description in the properties pane which is a multi-line free text property. These are exposed by the DMVs, so I’m going to make use of them and bring them out in the reports. This allows you to create the descriptions within BIDS as you’re developing the cube, meaning they’re version controlled and always in sync with the code.
I should also point out that I’m using SQL Server 2008 R2. All of the queries below will work with SQL 2008, but I want to use the spatial report functionality of SSRS 2008 R2 to generate dynamic star schema visualisations, which is only supported in R2.
In this post I’ll script out the stored procedures used as the basis of the documentation. In my next post I’ll put these into SSRS reports.
Let’s get started.
Firstly we need to create our linked server. This script will create a linked server called CubeLinkedServer pointing to the Adventure Works DW 2008R2 OLAP database on the local server.
EXEC master.dbo.sp_addlinkedserver @server = N'CubeLinkedServer', @srvproduct=N'MSOLAP', @provider=N'MSOLAP', @datasrc=N'(local)', @catalog=N'Adventure Works DW 2008R2'
You’ll have to set up the security according to your requirements. So now let’s start creating the source procs.
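Before building the procs it's worth checking that the linked server actually works. A quick smoke test might look something like the sketch below. It assumes the CubeLinkedServer linked server created above, and that the login you're running under has been given read access to the cube.

```sql
-- Checks that the linked server can be reached with the current security set-up
EXEC master.dbo.sp_testlinkedserver N'CubeLinkedServer';

-- Runs a trivial DMV query through the linked server
SELECT CAST([CATALOG_NAME] AS VARCHAR(255)) AS CatalogName
     , CAST([CUBE_NAME] AS VARCHAR(255))    AS CubeName
FROM OPENQUERY(CubeLinkedServer,
    'SELECT [CATALOG_NAME], [CUBE_NAME]
     FROM $SYSTEM.MDSCHEMA_CUBES');
```

If either statement fails, it's usually easier to sort out the connection or permissions now than to debug them through a stored proc later.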
The first proc lists all of the cubes. The MDSCHEMA_CUBES DMV returns not only cubes but also dimensions, so I’m filtering it to only return cubes by specifying CUBE_SOURCE=1.
CREATE PROCEDURE [dbo].[upCubeDocCubes]
(@Catalog VARCHAR(255) = NULL
)
AS
SELECT *
FROM OPENQUERY(CubeLinkedServer,
'SELECT *
FROM $SYSTEM.MDSCHEMA_CUBES
WHERE CUBE_SOURCE = 1')
WHERE CAST([CATALOG_NAME] AS VARCHAR(255)) = @Catalog
OR @Catalog IS NULL
GO
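The "= @Catalog OR @Catalog IS NULL" filter is what makes the parameter optional: passing NULL (or leaving the parameter out) returns every catalog, while passing a name restricts the results to that one. Here's a small stand-alone sketch of the pattern using a table variable, so it runs without the linked server. The table and the second catalog name are made up for the example.

```sql
-- Stand-in for the rows returned by the DMV; the second catalog is hypothetical
DECLARE @Cubes TABLE (CatalogName VARCHAR(255), CubeName VARCHAR(255));
INSERT INTO @Cubes (CatalogName, CubeName)
VALUES ('Adventure Works DW 2008R2', 'Adventure Works')
     , ('Some Other Catalog',        'Some Other Cube');

DECLARE @Catalog VARCHAR(255) = NULL;

-- NULL parameter: the second half of the OR is true for every row, so all rows come back
SELECT CatalogName, CubeName
FROM @Cubes
WHERE CatalogName = @Catalog
   OR @Catalog IS NULL;

SET @Catalog = 'Adventure Works DW 2008R2';

-- Named catalog: only the matching row comes back
SELECT CatalogName, CubeName
FROM @Cubes
WHERE CatalogName = @Catalog
   OR @Catalog IS NULL;
```

The first SELECT should return both rows and the second only the Adventure Works row.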
The next proc returns all measure groups found within a specified cube.
CREATE PROCEDURE [dbo].[upCubeDocMeasureGroupsInCube]
(@Catalog VARCHAR(255)
,@Cube VARCHAR(255)
)
AS
SELECT *
FROM OPENQUERY(CubeLinkedServer,
'SELECT *
FROM $SYSTEM.MDSCHEMA_MEASUREGROUPS ')
WHERE CAST([CATALOG_NAME] AS VARCHAR(255)) = @Catalog
AND CAST([CUBE_NAME] AS VARCHAR(255)) = @Cube
GO
This next proc returns a list of measures within a specified measure group.
CREATE PROCEDURE [dbo].[upCubeDocMeasuresInMeasureGroup]
(@Catalog VARCHAR(255)
,@Cube VARCHAR(255)
,@MeasureGroup VARCHAR(255)
)
AS
SELECT * FROM OPENQUERY(CubeLinkedServer,
'SELECT *
FROM $SYSTEM.MDSCHEMA_MEASURES
WHERE [MEASURE_IS_VISIBLE]')
WHERE CAST([CATALOG_NAME] AS VARCHAR(255)) = @Catalog
AND CAST([CUBE_NAME] AS VARCHAR(255)) = @Cube
AND CAST([MEASUREGROUP_NAME] AS VARCHAR(255)) = @MeasureGroup
GO
The following proc queries all dimensions available within a specified cube. I’m filtering using the DIMENSION_IS_VISIBLE column to only show visible dimensions.
CREATE PROCEDURE [dbo].[upCubeDocDimensionsInCube]
(@Catalog VARCHAR(255)
,@Cube VARCHAR(255)
)
AS
SELECT * FROM OPENQUERY(CubeLinkedServer,
'SELECT *
FROM $SYSTEM.MDSCHEMA_DIMENSIONS
WHERE [DIMENSION_IS_VISIBLE]')
WHERE CAST([CATALOG_NAME] AS VARCHAR(255)) = @Catalog
AND CAST([CUBE_NAME] AS VARCHAR(255)) = @Cube
GO
Then we can query all available attributes within a dimension. This DMV returns a bitmask field (LEVEL_ORIGIN) which defines whether the attribute is a key, attribute or hierarchy. I’m using bitwise AND (&) to split this into three separate fields for ease of use. I’m also filtering out invisible attributes, as well as those with a level of 0. Level 0 is the [All] member of any attribute, which we can ignore for this purpose.
CREATE PROCEDURE [dbo].[upCubeDocAttributesInDimension]
(@Catalog VARCHAR(255)
,@Cube VARCHAR(255)
,@Dimension VARCHAR(255)
)
AS
SELECT *
, CASE WHEN CAST([LEVEL_ORIGIN] AS INT) & 1 = 1
THEN 1 ELSE 0 END AS IsHierarchy
, CASE WHEN CAST([LEVEL_ORIGIN] AS INT) & 2 = 2
THEN 1 ELSE 0 END AS IsAttribute
, CASE WHEN CAST([LEVEL_ORIGIN] AS INT) & 4 = 4
THEN 1 ELSE 0 END AS IsKey
FROM OPENQUERY(CubeLinkedServer,
'SELECT *
FROM $SYSTEM.MDSCHEMA_LEVELS
WHERE [LEVEL_NUMBER]>0
AND [LEVEL_IS_VISIBLE]')
WHERE CAST([CATALOG_NAME] AS VARCHAR(255)) = @Catalog
AND CAST([CUBE_NAME] AS VARCHAR(255)) = @Cube
AND CAST([DIMENSION_UNIQUE_NAME] AS VARCHAR(255)) = @Dimension
GO
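If the bitmask logic is new to you, it can help to run the three CASE expressions against a few hard-coded values. The values below are just examples chosen to exercise each bit, not output from a real cube.

```sql
-- Example LEVEL_ORIGIN values: 1, 2 and 4 set one bit each, 6 sets two bits (2 + 4)
SELECT v.LevelOrigin
     , CASE WHEN v.LevelOrigin & 1 = 1
            THEN 1 ELSE 0 END AS IsHierarchy
     , CASE WHEN v.LevelOrigin & 2 = 2
            THEN 1 ELSE 0 END AS IsAttribute
     , CASE WHEN v.LevelOrigin & 4 = 4
            THEN 1 ELSE 0 END AS IsKey
FROM (VALUES (1), (2), (4), (6)) AS v(LevelOrigin);
```

A value of 6 should come back with both IsAttribute and IsKey set to 1, which is why the proc tests each bit separately rather than comparing the whole number.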
The next proc returns measure groups with their associated dimensions. We have to join two DMVs together in order to get the description columns of both the dimension and measure group.
CREATE PROCEDURE [dbo].[upCubeDocMeasureGroupsForDimension]
(@Catalog VARCHAR(255)
,@Cube VARCHAR(255)
,@Dimension VARCHAR(255)
)
AS
SELECT
mgd.*
, mg.[DESCRIPTION]
FROM OPENQUERY(CubeLinkedServer,
'SELECT
[CATALOG_NAME]
, [CUBE_NAME]
, [MEASUREGROUP_NAME]
, [MEASUREGROUP_CARDINALITY]
, [DIMENSION_UNIQUE_NAME]
FROM $SYSTEM.MDSCHEMA_MEASUREGROUP_DIMENSIONS
WHERE [DIMENSION_IS_VISIBLE]') mgd
INNER JOIN OPENQUERY(CubeLinkedServer,
'SELECT
[CATALOG_NAME]
,[CUBE_NAME]
,[MEASUREGROUP_NAME]
,[DESCRIPTION]
FROM $SYSTEM.MDSCHEMA_MEASUREGROUPS') mg
ON CAST(mgd.[CATALOG_NAME] AS VARCHAR(255))
= CAST(mg.[CATALOG_NAME] AS VARCHAR(255))
AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255))
= CAST(mg.[CUBE_NAME] AS VARCHAR(255))
AND CAST(mgd.[MEASUREGROUP_NAME] AS VARCHAR(255))
= CAST(mg.[MEASUREGROUP_NAME] AS VARCHAR(255))
WHERE CAST(mgd.[CATALOG_NAME] AS VARCHAR(255)) = @Catalog
AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255)) = @Cube
AND CAST(mgd.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255)) = @Dimension
GO
The next proc is similar to the above, but the opposite way around. It returns all dimensions that are related to a measure group.
CREATE PROCEDURE [dbo].[upCubeDocDimensionsForMeasureGroup]
(@Catalog VARCHAR(255)
,@Cube VARCHAR(255)
,@MeasureGroup VARCHAR(255)
)
AS
SELECT
mgd.*
, d.[DESCRIPTION]
FROM OPENQUERY(CubeLinkedServer,
'SELECT
[CATALOG_NAME]
,[CUBE_NAME]
,[MEASUREGROUP_NAME]
,[MEASUREGROUP_CARDINALITY]
,[DIMENSION_UNIQUE_NAME]
,[DIMENSION_CARDINALITY]
,[DIMENSION_IS_VISIBLE]
,[DIMENSION_IS_FACT_DIMENSION]
,[DIMENSION_GRANULARITY]
FROM $SYSTEM.MDSCHEMA_MEASUREGROUP_DIMENSIONS
WHERE [DIMENSION_IS_VISIBLE]') mgd
INNER JOIN OPENQUERY(CubeLinkedServer,
'SELECT
[CATALOG_NAME]
,[CUBE_NAME]
,[DIMENSION_UNIQUE_NAME]
,[DESCRIPTION]
FROM $SYSTEM.MDSCHEMA_DIMENSIONS
WHERE [DIMENSION_IS_VISIBLE]') d
ON CAST(mgd.[CATALOG_NAME] AS VARCHAR(255))
= CAST(d.[CATALOG_NAME] AS VARCHAR(255))
AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255))
= CAST(d.[CUBE_NAME] AS VARCHAR(255))
AND CAST(mgd.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
= CAST(d.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
WHERE CAST(mgd.[CATALOG_NAME] AS VARCHAR(255)) = @Catalog
AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255)) = @Cube
AND CAST(mgd.[MEASUREGROUP_NAME] AS VARCHAR(255)) = @MeasureGroup
GO
The next proc builds a BUS matrix, joining every dimension to its related measure groups. Later we’ll use the SSRS tablix control to pivot this into matrix form.
CREATE PROCEDURE [dbo].[upCubeDocBUSMatrix]
(@Catalog VARCHAR(255),
@Cube VARCHAR(255)
)
AS
SELECT
bus.[CATALOG_NAME]
,bus.[CUBE_NAME]
,bus.[MEASUREGROUP_NAME]
,bus.[MEASUREGROUP_CARDINALITY]
,bus.[DIMENSION_UNIQUE_NAME]
,bus.[DIMENSION_CARDINALITY]
,bus.[DIMENSION_IS_FACT_DIMENSION]
,bus.[DIMENSION_GRANULARITY]
,dim.[DIMENSION_MASTER_NAME]
,1 AS Relationship
FROM
OPENQUERY(CubeLinkedServer,
'SELECT
[CATALOG_NAME]
,[CUBE_NAME]
,[MEASUREGROUP_NAME]
,[MEASUREGROUP_CARDINALITY]
,[DIMENSION_UNIQUE_NAME]
,[DIMENSION_CARDINALITY]
,[DIMENSION_IS_FACT_DIMENSION]
,[DIMENSION_GRANULARITY]
FROM $SYSTEM.MDSCHEMA_MEASUREGROUP_DIMENSIONS
WHERE [DIMENSION_IS_VISIBLE]') bus
INNER JOIN OPENQUERY(CubeLinkedServer,
'SELECT
[CATALOG_NAME]
,[CUBE_NAME]
,[DIMENSION_UNIQUE_NAME]
,[DIMENSION_MASTER_NAME]
FROM $SYSTEM.MDSCHEMA_DIMENSIONS') dim
ON CAST(bus.[CATALOG_NAME] AS VARCHAR(255))
= CAST(dim.[CATALOG_NAME] AS VARCHAR(255))
AND CAST(bus.[CUBE_NAME] AS VARCHAR(255))
= CAST(dim.[CUBE_NAME] AS VARCHAR(255))
AND CAST(bus.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
= CAST(dim.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
WHERE CAST(bus.[CATALOG_NAME] AS VARCHAR(255)) = @Catalog
AND CAST(bus.[CUBE_NAME] AS VARCHAR(255)) = @Cube
GO
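As a rough illustration of the pivot that the tablix will perform later, the sketch below turns long-form relationship rows into a dimension by measure group grid. It is written in Python purely to show the reshaping; the sample rows and the `pivot_bus_matrix` helper are hypothetical and not part of the original procs.

```python
# Hypothetical rows shaped like the output of upCubeDocBUSMatrix:
# (MEASUREGROUP_NAME, DIMENSION_UNIQUE_NAME, Relationship)
rows = [
    ("Internet Sales", "[Date]", 1),
    ("Internet Sales", "[Customer]", 1),
    ("Reseller Sales", "[Date]", 1),
    ("Reseller Sales", "[Reseller]", 1),
]


def pivot_bus_matrix(rows):
    """Pivot long-form relationship rows into a dimension x measure group grid."""
    measure_groups = sorted({mg for mg, _, _ in rows})
    dimensions = sorted({dim for _, dim, _ in rows})
    related = {(dim, mg) for mg, dim, rel in rows if rel == 1}
    # One entry per dimension, one cell per measure group: 1 = related, 0 = not.
    matrix = {
        dim: [1 if (dim, mg) in related else 0 for mg in measure_groups]
        for dim in dimensions
    }
    return measure_groups, matrix


columns, matrix = pivot_bus_matrix(rows)
print("Dimension".ljust(12), *columns, sep=" | ")
for dim, cells in matrix.items():
    print(dim.ljust(12), *cells, sep=" | ")
```

The proc only returns pairs that are related (Relationship is always 1), so the zeros in the grid come from the pivot step rather than from the data.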
Next, in order to make it easier for users to find items within the cube, I’ve created a searching proc which will scour a number of the DMVs for anything containing the search term.
CREATE PROCEDURE [dbo].[upCubeDocSearch]
(@Search VARCHAR(255)
,@Catalog VARCHAR(255)=NULL
,@Cube VARCHAR(255)=NULL
)
AS
WITH MetaData AS
(
--Cubes
SELECT CAST('Cube' AS VARCHAR(20)) AS [Type]
, CAST(CATALOG_NAME AS VARCHAR(255)) AS [Catalog]
, CAST(CUBE_NAME AS VARCHAR(255)) AS [Cube]
, CAST(CUBE_NAME AS VARCHAR(255)) AS [Name]
, CAST(DESCRIPTION AS VARCHAR(4000)) AS [Description]
, CAST(CUBE_NAME AS VARCHAR(255)) AS [Link]
FROM OPENQUERY(CubeLinkedServer,
'SELECT [CATALOG_NAME], [CUBE_NAME], [DESCRIPTION]
FROM $SYSTEM.MDSCHEMA_CUBES
WHERE CUBE_SOURCE = 1')
WHERE (CAST([CATALOG_NAME] AS VARCHAR(255))
= @Catalog OR @Catalog IS NULL)
UNION ALL
--Dimensions
SELECT CAST('Dimension' AS VARCHAR(20)) AS [Type]
, CAST(CATALOG_NAME AS VARCHAR(255)) AS [Catalog]
, CAST(CUBE_NAME AS VARCHAR(255)) AS [Cube]
, CAST(DIMENSION_NAME AS VARCHAR(255)) AS [Name]
, CAST(DESCRIPTION AS VARCHAR(4000)) AS [Description]
, CAST(DIMENSION_UNIQUE_NAME AS VARCHAR(255)) AS [Link]
FROM OPENQUERY(CubeLinkedServer,
'SELECT [CATALOG_NAME], [CUBE_NAME]
, [DIMENSION_NAME], [DESCRIPTION]
, [DIMENSION_UNIQUE_NAME]
FROM $SYSTEM.MDSCHEMA_DIMENSIONS
WHERE [DIMENSION_IS_VISIBLE]')
WHERE (CAST([CATALOG_NAME] AS VARCHAR(255))
= @Catalog OR @Catalog IS NULL)
AND (CAST([CUBE_NAME] AS VARCHAR(255))
= @Cube OR @Cube IS NULL)
AND LEFT(CAST(CUBE_NAME AS VARCHAR(255)),1)
<>'$' --Filter out dimensions not in a cube
UNION ALL
--Attributes
SELECT CAST('Attribute' AS VARCHAR(20)) AS [Type]
, CAST(CATALOG_NAME AS VARCHAR(255)) AS [Catalog]
, CAST(CUBE_NAME AS VARCHAR(255)) AS [Cube]
, CAST(LEVEL_CAPTION AS VARCHAR(255)) AS [Name]
, CAST(DESCRIPTION AS VARCHAR(4000)) AS [Description]
, CAST(DIMENSION_UNIQUE_NAME AS VARCHAR(255)) AS [Link]
FROM OPENQUERY(CubeLinkedServer,
'SELECT [CATALOG_NAME], [CUBE_NAME]
, [LEVEL_CAPTION], [DESCRIPTION]
, [DIMENSION_UNIQUE_NAME]
FROM $SYSTEM.MDSCHEMA_LEVELS
WHERE [LEVEL_NUMBER]>0
AND [LEVEL_IS_VISIBLE]')
WHERE (CAST([CATALOG_NAME] AS VARCHAR(255))
= @Catalog OR @Catalog IS NULL)
AND (CAST([CUBE_NAME] AS VARCHAR(255))
= @Cube OR @Cube IS NULL)
AND LEFT(CAST(CUBE_NAME AS VARCHAR(255)),1)
<>'$' --Filter out dimensions not in a cube
UNION ALL
--Measure Groups
SELECT CAST('Measure Group' AS VARCHAR(20)) AS [Type]
, CAST(CATALOG_NAME AS VARCHAR(255)) AS [Catalog]
, CAST(CUBE_NAME AS VARCHAR(255)) AS [Cube]
, CAST(MEASUREGROUP_NAME AS VARCHAR(255)) AS [Name]
, CAST(DESCRIPTION AS VARCHAR(4000)) AS [Description]
, CAST(MEASUREGROUP_NAME AS VARCHAR(255)) AS [Link]
FROM OPENQUERY(CubeLinkedServer,
'SELECT [CATALOG_NAME], [CUBE_NAME]
, [MEASUREGROUP_NAME]
, [DESCRIPTION]
FROM $SYSTEM.MDSCHEMA_MEASUREGROUPS')
WHERE (CAST([CATALOG_NAME] AS VARCHAR(255))
= @Catalog OR @Catalog IS NULL)
AND (CAST([CUBE_NAME] AS VARCHAR(255))
= @Cube OR @Cube IS NULL)
AND LEFT(CAST(CUBE_NAME AS VARCHAR(255)),1)
<>'$' --Filter out dimensions not in a cube
UNION ALL
--Measures
SELECT CAST('Measure' AS VARCHAR(20)) AS [Type]
, CAST(CATALOG_NAME AS VARCHAR(255)) AS [Catalog]
, CAST(CUBE_NAME AS VARCHAR(255)) AS [Cube]
, CAST(MEASURE_NAME AS VARCHAR(255)) AS [Name]
, CAST(DESCRIPTION AS VARCHAR(4000)) AS [Description]
, CAST(MEASUREGROUP_NAME AS VARCHAR(255)) AS [Link]
FROM OPENQUERY(CubeLinkedServer,
'SELECT [CATALOG_NAME], [CUBE_NAME]
, [MEASURE_NAME], [DESCRIPTION]
, [MEASUREGROUP_NAME]
FROM $SYSTEM.MDSCHEMA_MEASURES
WHERE [MEASURE_IS_VISIBLE]')
WHERE (CAST([CATALOG_NAME] AS VARCHAR(255))
= @Catalog OR @Catalog IS NULL)
AND (CAST([CUBE_NAME] AS VARCHAR(255))
= @Cube OR @Cube IS NULL)
AND LEFT(CAST(CUBE_NAME AS VARCHAR(255)),1)
<>'$' --Filter out dimensions not in a cube
)
SELECT *
FROM MetaData
WHERE @Search<>''
AND ([Name] LIKE '%' + @Search + '%'
OR [Description] LIKE '%' + @Search + '%'
)
GO
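The filtering logic at the end of the search proc can be sketched in Python as follows. This is only an approximation under stated assumptions: the sample rows and the `search_metadata` helper are hypothetical, the match is assumed to be case-insensitive (in SQL Server that depends on the collation), and unlike LIKE it does not treat % or _ in the search term as wildcards.

```python
# Hypothetical metadata rows shaped like the MetaData CTE: (Type, Name, Description)
metadata = [
    ("Cube", "Adventure Works", "Sales and finance cube"),
    ("Dimension", "Customer", "People who buy online"),
    ("Measure", "Internet Sales Amount", None),
]


def search_metadata(rows, search):
    """Return rows whose name or description contains the search term."""
    if search == "":           # mirrors WHERE @Search <> ''
        return []
    term = search.lower()      # assumption: case-insensitive collation
    return [
        row for row in rows
        if term in row[1].lower()
        # A missing (NULL) description never matches.
        or (row[2] is not None and term in row[2].lower())
    ]


for row in search_metadata(metadata, "sales"):
    print(row)
```

An empty search term returns nothing rather than everything, which stops a blank search box from listing the entire cube.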
We can now use these procs to form the basis of a number of SSRS reports which will dynamically query the DMVs to generate the SSAS cube documentation. I’ll be covering this stage in my next post.
- Part 1 – Creating the DMV stored procs
- Part 2 – Create the SSRS reports
- Part 3 – Use spatial data and maps to create a star schema view
- Download Source Code
News Flash: Purple Frog now has a Tumblr page
Thank you to the BCS Shropshire branch for opening your doors to the Purple Frog team. We’re delighted to be given any chance to talk about Business Intelligence, and spread the word to increase awareness of what it is and how it all works. We find it a fascinating world to work in, and hope that there were some converts in the audience!
For those that asked, here’s a copy of the slides for you to download. As the presentation was largely demonstration based, we’ve taken some screenshots of the key points of the demo and included them within the slides. We did try and take a video of the event, but unfortunately the camera didn’t want to play ball so we’ll have to stick with the slides.
We’re always looking for other groups that are interested in learning about BI, and are more than happy to run talks (techy or not) so let us know if you’re interested.
On Monday 20th September, Alex & Hollie from Purple Frog Systems will be giving a talk on business intelligence to the Shropshire branch of the British Computer Society.
The event is free to attend, even for non-BCS members, and will be held from 6.15pm at the Telford campus of Wolverhampton Uni.
We’ll be providing an introduction to BI, explaining what it is, how it works and how it can benefit your organisation.
We’ll present a fully working demonstration of how to turn a simple list of customers/members into a highly interactive information system, using the latest in spatial mapping techniques and OLAP cubes.
Register for free at the BCS events website
For those that haven’t yet heard of DAX, it’s an expression language developed by Microsoft to perform calculations against PowerPivot. Stepping back one step further, PowerPivot is essentially a local Analysis Services cube that runs within Excel 2010.
I’ve heard plenty of comments from various sources about how DAX is the [multi-dimensional] query language of the future and how it’s going to kill off MDX. Ok…. well no, it’s not ok.
For starters, they exist for two different purposes. DAX is not a query language but an expression language. You can use MDX to query information from a cube and generate a pivot; you can’t with DAX, as it is not a query language.
The best way to think of DAX is as an extension to Excel formulas: you can use it to perform a calculation against an existing set of cells, in this case a PowerPivot dataset.
It is also similar to an MDX calculated member, and in fact supports a number of MDX functions (TotalYTD, ParallelPeriod etc.).
If your data is in a database: You would use SQL to query data from a database and import the results into Excel. You would then use Excel expressions/calculations to enhance the data.
If your data is in a cube: You would use an Excel pivot (or MDX query) to query the data and import the results into Excel. You then have to use a third party tool such as OLAP PivotTable Extensions to add expressions/calculations to enhance the data.
If your data is in PowerPivot: You would use PowerPivot to query the data and import the results into Excel. You would then use DAX to add calculations to enhance the data.
DAX is a fantastic expression tool, and one that provides significant power to PowerPivot, but no, it won’t replace MDX. My hope is that Microsoft will provide DAX capability for MDX queries as well, and not restrict it to PowerPivot queries. As I’ve shown in my previous blog post it’s a great expression language that would provide significant benefit to cube users.
In my previous post I explained how to create a calculated MDX member that projects full year data (sales etc.) based on existing year to date data.
In this post I’ll be doing exactly the same but in DAX, the new expression language used to enhance PowerPivot data. As it’s the same desired outcome, I’m not going to repeat the background; you’ll have to look at my previous post for that.
The expressions below assume that you have a single table ‘Sales’ with a [Date] column and an [Internet Sales Amount] column.
Step 1 – What are our total sales so far this year?
We use the TOTALYTD function to work this out for us
=TOTALYTD(SUM(Sales[Internet Sales Amount])
,Sales[Date], all(Sales))
The first parameter is the expression we want to calculate, i.e. the sum of internet sales.
The second parameter specifies the date we’re calculating up to.
The third parameter is one that catches a lot of people out. We have to tell DAX the context of the date. As it stands the expression can only see the data in the selected row; by specifying all(Sales) in the filter we expand the expression to be able to look at all of the data.
Step 2 – How far are we through the year?
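To see why the calculation needs access to every row and not just the current one, here is a small Python sketch of a year-to-date total. It is only an analogy for what TOTALYTD does, not how PowerPivot evaluates it; the sample figures and the `total_ytd` helper are made up for illustration.

```python
from datetime import date

# Hypothetical sales rows: (date, internet sales amount)
sales = [
    (date(2003, 1, 15), 100.0),
    (date(2003, 2, 10), 50.0),
    (date(2003, 3, 5), 25.0),
    (date(2004, 1, 20), 10.0),
]


def total_ytd(all_sales, as_of):
    """Sum every sale from 1 January of as_of's year up to and including as_of."""
    return sum(
        amount for day, amount in all_sales
        if day.year == as_of.year and day <= as_of
    )


# Each row's YTD figure is computed against the whole table, not the row alone.
for day, _ in sales:
    print(day, total_ytd(sales, day))
```

Passing the full list into the function plays the same role as all(Sales): the row for 5 March can only report a running total if it is allowed to see the January and February rows too.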
It’s here where DAX really shows an improvement in the functions available over and above what’s available in MDX. There’s a YEARFRAC function which calculates how far we are through the year.
=YEARFRAC(
CONCATENATE("01/01/"
,year(Sales[Date]))
,Sales[Date])
The first parameter is the start date, i.e. the 1st January. We have to build this using the year of the selected row to ensure we get the right year.
The second parameter is the date of the record we’re looking at, Sales[Date].
Step 3 – Project the value to the end of the year
We combine the two values by simply dividing the YTD figure by how far we are through the year
=TOTALYTD(SUM(Sales[Internet Sales Amount])
,Sales[Date], all(Sales))
/ YEARFRAC(
CONCATENATE("01/01/"
, year(Sales[Date]))
,Sales[Date])
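The same division can be sketched in Python to check the arithmetic by hand. This is a simplification and not a port of the DAX: the `year_fraction` helper (a hypothetical name) counts actual days, whereas YEARFRAC defaults to a 30/360 day count, so the two will differ slightly on most dates.

```python
from datetime import date


def year_fraction(day):
    """Fraction of the calendar year elapsed between 1 January and `day`.

    Simplification: uses actual days, not YEARFRAC's default 30/360 basis.
    """
    start = date(day.year, 1, 1)
    days_in_year = (date(day.year + 1, 1, 1) - start).days
    return (day - start).days / days_in_year


def full_year_run_rate(ytd_sales, day):
    """Project year-to-date sales to a full-year figure."""
    fraction = year_fraction(day)
    if fraction == 0:
        # 1 January: no time has elapsed, so there is nothing to scale by.
        return None
    return ytd_sales / fraction


# Half way through a leap year, 200 of YTD sales projects to 400 for the year.
print(full_year_run_rate(200.0, date(2004, 7, 2)))
```

Because the fraction is tiny in the first few days of January, a small YTD figure is divided by a very small number, which is why the projection is so jumpy at the start of the year.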

This chart shows how the full year run rate is adjusted throughout the year as the cumulative sales to date grows. At the start of the year it’s quite volatile, but from February it settles down with an accurate projection.
And it really is as easy as that.
Frog-Blog Out
This post explains how to create an MDX calculated member that will take a value from the cube and project it forward to the end of the year. This provides a simple mechanism for calculating what your expected total will be at year end, based upon current performance.
To do this more accurately you should use time series data mining models in SSAS and use DMX expressions to query the results, but this method is very simple and requires little effort, and will be pretty accurate so long as the data you’re modelling is fairly linear. Please note though that the more cyclical and seasonal your data is the less effective this will be.
The basic idea is that we take what we have done so far (i.e. year to date sales), look at how far through the year we are, and extrapolate the value of future months (or days/weeks/etc.) based upon values so far.
For example, if we’re at March month end and we’ve sold 100 widgets so far this year, we’re 1/4 of the way through the year, so we multiply 100 by 4 and get a projected yearly total of 400.
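The widget example boils down to one line of arithmetic, shown here as a minimal Python sketch. The `project_full_year` function is a hypothetical name used only for illustration.

```python
def project_full_year(ytd_total, periods_elapsed, periods_in_year):
    """Scale a year-to-date total linearly up to a full-year estimate."""
    if periods_elapsed <= 0:
        raise ValueError("at least one period must have elapsed")
    return ytd_total * periods_in_year / periods_elapsed


# March month end: 3 of 12 months elapsed, 100 widgets sold so far.
print(project_full_year(100, 3, 12))
```

The same function works whether the periods are months, quarters or days, as long as both counts are measured in the same unit.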

This chart shows the concept of what we’re doing, and shows the full year projections calculated in March (with 3 months of available data) and June (6 months of data). The projections obviously get more accurate the further you are through the year.
One of the points to note is that when creating a calculation like this, based upon a time dimension, the calculation should always work with any level of the dimension hierarchy selected. i.e. The user shouldn’t care whether they’re looking at a month, week, quarter or a day, the calculation should always work the same. To achieve this we simply use the .currentmember of the time hierarchy.
The following examples are based upon projecting the Internet Sales Amount measure found within the SQL Server 2008 Adventure Works DW sample cube.
Step 1 – What are our total sales so far this year?
MDX helpfully provides us with the YTD function which takes care of this for us.
MEMBER [Measures].[YTD Sales] AS
AGGREGATE(
YTD([Date].[Calendar].CurrentMember)
,[Measures].[Internet Sales Amount])
This takes the current member of the Calendar hierarchy, and creates a set of all dates in the same year up to and including it using YTD. It then aggregates (in this case sums) the Internet Sales Amount for all of these dates to calculate YTD Sales.
Step 2 – Which period are we in?
Here we’ll use the same YTD function to create a set of all dates so far this year, but in this case we’ll count the number of resulting members. Note that because we’re using the .CurrentMember of the hierarchy, it doesn’t matter if we’re looking at a date, week or month, the MDX will work. i.e. If we’re looking at 21 Jan it will return 21. If we’re looking at Q3 it will return 3, August will return 8 etc.
MEMBER [Measures].[CurPeriod] AS
COUNT(
YTD([Date].[Calendar].CurrentMember)
,INCLUDEEMPTY)
Step 3 – How many periods are in the year?
If we coded this to only work with months then we could hard code this to 12; however, we need to keep it generic to all levels of the hierarchy. So, we have to count all the cousins of the current time member [within this year]. Unfortunately there isn’t a Cousins function in MDX, and Siblings will only return other members within the same parent, i.e. siblings of May 4th would include May 1 through to May 31. To get around this we find the year of the current member by using the Ancestor function.
ANCESTOR([Date].[Calendar].CurrentMember
, [Date].[Calendar].[Calendar Year])
Then we find all of the descendants of the year, at the same level of the hierarchy (week/day/etc.) as the current member. We can then take a count as before.
MEMBER [Measures].[TotalPeriods] AS
COUNT(
DESCENDANTS(
ANCESTOR([Date].[Calendar].CurrentMember
,[Date].[Calendar].[Calendar Year])
,[Date].[Calendar].CurrentMember.level)
,INCLUDEEMPTY)
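The idea behind CurPeriod and TotalPeriods can be sketched outside MDX with a toy calendar. The Python below is only an analogy under stated assumptions: the `periods`, `cur_period` and `total_periods` helpers are hypothetical, and the calendar has just three levels (quarter, month, day) rather than the full Adventure Works hierarchy.

```python
import calendar
from datetime import date


def periods(year, level):
    """List the members of `year` at a given level: 'quarter', 'month' or 'day'."""
    if level == "quarter":
        return [(year, q) for q in range(1, 5)]
    if level == "month":
        return [(year, m) for m in range(1, 13)]
    if level == "day":
        return [
            date(year, m, d)
            for m in range(1, 13)
            for d in range(1, calendar.monthrange(year, m)[1] + 1)
        ]
    raise ValueError(f"unknown level: {level}")


def cur_period(member, year, level):
    """Position of `member` within its year (the CurPeriod idea)."""
    return periods(year, level).index(member) + 1


def total_periods(year, level):
    """Number of members in the year at the same level (the TotalPeriods idea)."""
    return len(periods(year, level))


print(cur_period((2003, 8), 2003, "month"), total_periods(2003, "month"))
print(cur_period(date(2003, 1, 21), 2003, "day"), total_periods(2003, "day"))
```

Listing every member of the year at the chosen level is the equivalent of going up to the year with Ancestor and back down with Descendants, which is what lets the same calculation work for days, months or quarters.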
Step 4 – Calculate the Run Rate
Calculating the projected yearly total (run rate) is then a simple calculation
MEMBER [Measures].[Full Year Run Rate] AS
[Measures].[YTD Sales]
* ([Measures].[TotalPeriods]
/[Measures].[CurPeriod])
You can then put the whole lot together and see the results…
WITH
MEMBER [Measures].[YTD Sales] AS
AGGREGATE(
YTD([Date].[Calendar].CurrentMember)
,[Measures].[Internet Sales Amount])
MEMBER [Measures].[CurPeriod] AS
COUNT(
YTD([Date].[Calendar].CurrentMember)
,INCLUDEEMPTY)
MEMBER [Measures].[TotalPeriods] AS
COUNT(
DESCENDANTS(
ANCESTOR([Date].[Calendar].CurrentMember
,[Date].[Calendar].[Calendar Year])
,[Date].[Calendar].CurrentMember.level)
,INCLUDEEMPTY)
MEMBER [Measures].[Full Year Run Rate] AS
[Measures].[YTD Sales]
* ([Measures].[TotalPeriods]
/[Measures].[CurPeriod])
SELECT
{
[Measures].[Internet Sales Amount]
,[Measures].[YTD Sales]
,[Measures].[Full Year Run Rate]
,[Measures].[CurPeriod]
,[Measures].[TotalPeriods]
} ON 0,
{
DESCENDANTS([Date].[Calendar].[CY 2003])
} ON 1
FROM [Direct Sales]
In my next blog I’ll be doing the same calculation in DAX for use with PowerPivot, stay tuned…
Frog-Blog Out
I thought I’d take a break from writing posts about Business Intelligence and SQL Server, and instead share with you my elation at finding a laptop hard disk that quite simply makes the world a better place, the Seagate Momentus XT hybrid drive.
When I purchased my current laptop (Dell XPS M1530 if you’re interested, with 4GB RAM) I was presented with a choice between a fast 7200rpm 200GB drive or a slower 5400rpm 320GB drive. Due to the size of the databases I tend to work with I had to opt for the larger of the two, a Western Digital Caviar, taking the hit on performance.
I’ve been tempted for a while to upgrade the disk to a 7200rpm but have been secretly holding out (in vain) for solid state disks to increase in size and performance whilst decreasing in price. £600 for a 256GB SSD still renders them too expensive and too small to be an effective option for my needs. 512GB drives are expected soon, but with a price tag of over £1000. No thanks.
Enter Seagate, with their Momentus XT hybrid drive which is now available in the UK. The 500GB version (also available in 250GB and 320GB) is a standard laptop sized 2.5″ drive which combines 4GB of super fast SLC NAND solid state storage alongside a 500GB traditional 7200rpm drive. It also has 32MB of drive-level cache. The drive monitors disk usage and automatically uses the SSD for the most commonly used files, without any help or drivers on the operating system. Thus you get the size/cost benefit of a standard drive but the performance boost of an SSD for your most accessed files. And all this for less than £100… How could I resist?!
After a weekend of reinstalling Windows 7 Ultimate (x64), Office 2010, SQL 2008 R2 and the usual plethora of other software, the results are quite simply staggering. My previous setup would let me log in to Windows after 60 seconds, but I had to wait a total of 7.5 minutes until Outlook was open and usable. In the new setup I can log in to Windows after 35 seconds, and Outlook is open and usable in under 1.5 minutes. 6 minutes saved per day just on bootup. That’s a whole 24 hours per year.
I have to place a caveat here: there are a number of software differences between the two systems, so it’s not by any means a scientific test. My old system was XP Pro x86 and the new one is Windows 7 Ultimate x64, I’ve changed SQL Server 2008 to 2008 R2, and all the drivers/software are 64 bit instead of 32 bit. This will certainly make a difference on its own so the performance is not entirely down to the drive, however I have to assume that it takes the majority of the credit. Every detailed review that I’ve seen reports average performance as pretty much mid-way between a 7200rpm drive and an SSD.
The only downside is that I’ve now got to spend a few more weekends upgrading the other company laptops!

