Cube

Renaming an SSAS Tabular Model

I came across a frustrating problem today. I’d just finished processing a large tabular cube (SQL Server 2012), which had taken 11 hours in total.

On trying to connect to the cube to test it, I realised I’d made a schoolboy error: the database was named correctly, but the model inside it was named MyCubeName_Test instead of MyCubeName. No problem, I’ll just right click the cube in SSMS and rename it. Well, no, there is no option to rename a model, just the database. I didn’t fancy doing a full reprocess, but luckily a little digging in the XML files presented a solution.

  1. Detach the cube
  2. Open up the cube’s data folder in explorer (x:\xx\OLAP\data\MyCubeName.0.db, or whatever it happens to be in your case)
  3. Find the Model.xx.cub.xml file, and open it in Notepad++ (other text editors are available…)
  4. Search for the <Name> tag, and just change the name inside it
  5. Save the file and close it
  6. Re-attach the cube
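
If you’d rather script step 4 than edit the file by hand, a minimal Python sketch might look like the one below. It assumes the model name is stored as a plain `<Name>old name</Name>` element and only touches the first match. The `rename_model` helper and the sample XML are made up for illustration, not taken from a real Model.xx.cub.xml file, so take a backup copy of the file before trying anything like this.

```python
import re
from xml.sax.saxutils import escape


def rename_model(xml_text, old_name, new_name):
    """Replace the first <Name>old_name</Name> element with new_name.

    Hypothetical helper: assumes the name is stored as plain element text,
    as described in step 4, and raises if it cannot be found.
    """
    pattern = re.compile("<Name>" + re.escape(escape(old_name)) + "</Name>")
    replacement = "<Name>" + escape(new_name) + "</Name>"
    # A function replacement keeps any backslashes in the new name literal.
    new_text, count = pattern.subn(lambda m: replacement, xml_text, count=1)
    if count == 0:
        raise ValueError("No <Name> element found for " + old_name)
    return new_text


# Made-up fragment standing in for the real file contents.
sample = "<Cube><ID>Model</ID><Name>MyCubeName_Test</Name></Cube>"
print(rename_model(sample, "MyCubeName_Test", "MyCubeName"))
```

The cube still needs to be detached before the file is changed and re-attached afterwards, exactly as in the manual steps.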

Simples

Frog-Blog-Out

SSAS Tabular performance – DefaultSegmentRowCount

I’m currently investigating a poorly performing Tabular model, and came across some interesting test results which seem to contradict the advice in Microsoft’s Performance Tuning of Tabular Models white paper.

Some background:

  • 7.6GB SSAS tabular cube, running on a 2 x CPU 32 core (Xeon E5-2650 2GHz, 2 x NUMA nodes) server with 144GB RAM
  • SQL Server 2012 SP1 CU7 Enterprise
  • 167m rows of data in primary fact
  • 80m distinct CustomerKey values in primary fact
  • No cube partitioning

A simple distinct count in DAX of the CustomerKey, with no filtering, is taking 42 seconds on a cold cache. Far too slow for a tabular model. Hence the investigation.

p88 of the Performance Tuning of Tabular Models white paper discusses the DefaultSegmentRowCount, explaining that it defaults to 8m, and that there should be a correlation between the number of cores and the number of segments. [The number of segments is calculated as the number of rows divided by the segment size].

It also indicates that a higher segment size may increase compression, and consequently query performance.

Calculating the number of segments for our data set gives us the following options:

Rows: 167,000,000

Segment Size          # Segments
1048576               160
2097152               80
4194304               40
8388608 [default]     20
16777216              10
33554432              5
67108864              3
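
The segment counts are just a ceiling division of rows by segment size, which is easy to sanity check. The sketch below assumes the rounded 167,000,000 row figure quoted above and that a part-filled final segment still counts as a segment. The function name is my own.

```python
ROWS = 167_000_000   # rounded row count quoted in the post
CORES = 32           # cores on the server described above


def segment_count(rows, segment_size):
    """Ceiling division: a part-filled last segment still counts as one."""
    return -(-rows // segment_size)


# Segment sizes from 1m (2^20) to 64m (2^26) rows, doubling each time.
for power in range(20, 27):
    size = 2 ** power
    count = segment_count(ROWS, size)
    print(f"{size:>10} rows/segment -> {count:>3} segments "
          f"({count / CORES:.2f} per core)")
```

Printing the segments-per-core ratio alongside each option makes it easier to compare a setting against the white paper’s guidance before running any timing tests.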

So, with 32 cores to play with, we should be looking at the default segment size (8m) or maybe reduce it to 4m to get 40 segments. But the extra compression with 16m segment size may be of benefit. So I ran some timing tests on the distinct count measure, and the results are quite interesting.

[Chart: distinct count query timings by DefaultSegmentRowCount setting]

It clearly shows that in this environment, reducing the DefaultSegmentRowCount property down to 2m improved the query performance (on a cold cache) from 42s down to 27s, a 36% improvement. As well as this, processing time was reduced, as was compression.

This setting creates 80 segments, 2.5 times the number of cores available, but achieved the best performance. Note that the server’s ProcessingTimeboxSecPerMRow setting has been set to 0 to allow for maximum compression.

There’s more to this system’s performance problems than just this, NUMA for a start, but I thought I’d throw this out there in case anyone else is blindly following the performance tuning white paper without doing their own experimentation.

Each environment, data set and server spec is different, so if you need to eke out the last ounce of performance, run your own tests on the SSAS settings and see for yourself.

Frog-Blog Out

[Update: Follow up post exploring the performance impact of NUMA on this server]

Video: Automating SSAS OLAP Cube documentation

Automating OLAP cube documentation – SQLBits presentation

For anyone that missed my presentation at SQLBits 8 in April, the video is now available here.

In this 1 hour session I present a method of automating the creation of documentation for SSAS OLAP cubes by using DMVs (dynamic management views) and spatial data, querying the metadata of the cube in real time.

The results include the BUS matrix, star schemas, attribute lists, hierarchies etc. and are all presented in SSRS.

The blog posts to go with this are here:

You can view the slide deck here

Debug MDX queries using Drillthrough in SSMS

One of the great features of using Excel to browse an SSAS OLAP cube is the drillthrough ability. If you double click on any cell of an OLAP pivot table, Excel will create a new worksheet containing the top 1000 fact records that went to make up the figure in the selected cell.

N.B. The limit of 1000 rows can be altered, as per one of my previous blog posts here.

This feature is pretty well known, but not many folk realise how easy it is to reproduce this in SQL Server Management Studio (SSMS). All you need to do is prefix your query with DRILLTHROUGH.

i.e. Assuming an MDX query of

SELECT [Measures].[Internet Sales Amount] ON 0
FROM [Adventure Works]
WHERE [Date].[January 1, 2004]

Which returns the following results…

A query of

DRILLTHROUGH
SELECT [Measures].[Internet Sales Amount] ON 0
FROM [Adventure Works]
WHERE [Date].[January 1, 2004]

Returns the records contributing to the total figure. Great for diagnosing problems with an MDX query.

By default, only the first 10,000 rows are returned, but you can override this using MAXROWS

DRILLTHROUGH MAXROWS 500
SELECT [Measures].[Internet Sales Amount] ON 0
FROM [Adventure Works]
WHERE [Date].[January 1, 2004]

The columns that are returned are those defined in the Actions tab of the Cube Designer in BIDS (The Business Intelligence Development Studio).

If no action is defined, then the fact measures will be returned along with the keys that link to each relevant dimension, which tend not to be that helpful.

You can override the returned columns by using the RETURN clause

DRILLTHROUGH
SELECT [Measures].[Internet Sales Amount] ON 0
FROM [Adventure Works]
WHERE [Date].[January 1, 2004]
RETURN
     [$Internet Sales Order Details].[Internet Sales Order]
    ,[$Sales Territory].[Sales Territory Region]
    ,NAME([$Product].[Product])
    ,KEY([$Product].[Product])
    ,UNIQUENAME([$Product].[Product])
    ,[Internet Sales].[Internet Sales Amount]
    ,[Internet Sales].[Internet Order Quantity]
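
If you find yourself wrapping lots of queries like this while debugging, the pattern is simple enough to script. The Python helper below is a hypothetical convenience of my own, not part of any SSAS client library. It only assembles the query text, using the DRILLTHROUGH, MAXROWS and RETURN keywords shown above, and leaves running the result to whichever client you use.

```python
def build_drillthrough(select_mdx, max_rows=None, return_columns=None):
    """Wrap a single-cell MDX SELECT in a DRILLTHROUGH statement (text only)."""
    header = "DRILLTHROUGH"
    if max_rows is not None:
        header += f" MAXROWS {int(max_rows)}"
    lines = [header, select_mdx.strip()]
    if return_columns:
        lines.append("RETURN " + ", ".join(return_columns))
    return "\n".join(lines)


query = """
SELECT [Measures].[Internet Sales Amount] ON 0
FROM [Adventure Works]
WHERE [Date].[January 1, 2004]
"""

print(build_drillthrough(
    query,
    max_rows=500,
    return_columns=[
        "[$Sales Territory].[Sales Territory Region]",
        "[Internet Sales].[Internet Sales Amount]",
    ],
))
```

The helper does no validation, so the restrictions listed below still apply to whatever SELECT you pass in.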


Note that there are some restrictions on what you can drill through

  • You can’t drill through an expression/calculation, only a raw measure
  • The MDX query needs to return a single cell (otherwise the cube does not know which one to drill through)
  • The data returned will be at the lowest granularity of the cube’s fact table

To explain the last point further, the cube does not return the raw data from the underlying data warehouse, but a summary of the facts grouped by unique combination of the relevant dimensions. For example, if a warehouse table containing individual sales (by date, product, customer & store) is brought into a cube as a fact table that only has relationships with the date and product dimensions, then the cube drill through will return unique combinations of date and product, summarising sales for each combination. Extra granularity which the warehouse may contain (customer and store) will not be available.
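
To make that concrete, here is a small Python sketch of the same idea. The rows and names are invented for illustration. The warehouse holds sales by date, product, customer and store, but the fact table only relates to date and product, so the drill through can only ever show one summarised row per date and product combination.

```python
from collections import defaultdict

# Invented warehouse rows: (date, product, customer, store, sales_amount)
warehouse_rows = [
    ("2004-01-01", "Bike", "Alice", "Leeds", 100.0),
    ("2004-01-01", "Bike", "Bob", "York", 150.0),
    ("2004-01-01", "Helmet", "Alice", "Leeds", 20.0),
    ("2004-01-02", "Bike", "Carol", "York", 120.0),
]


def drillthrough_rows(rows):
    """Summarise sales at the cube's fact granularity (date and product)."""
    totals = defaultdict(float)
    for date, product, _customer, _store, amount in rows:
        # Customer and store are not related to the fact table, so they
        # cannot be used to split the rows any further.
        totals[(date, product)] += amount
    return sorted(totals.items())


for key, amount in drillthrough_rows(warehouse_rows):
    print(key, amount)
```

Four warehouse rows become three drill through rows, because the two bike sales on the same day can no longer be told apart.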

Note that if you specify the RETURN columns, the rows are still returned at the lowest level of the fact table granularity, even if not all of the dimensions are brought out as columns. This may result in returning multiple identical records. Don’t worry, these will be distinct facts, just differentiated by a dimension/attribute that isn’t being returned.

You can find out more on TechNet here

Frog-Blog Out

How to add calculations in SSAS cubes

One of the most useful aspects of a Business Intelligence system is the ability to add calculations to create new measures. This centralises the logic of the calculation into a single place, ensuring consistency and standardisation across the user base.

By way of example, a simple calculation for profit (Income – Expenditure) wouldn’t be provided by the source database and historically would be implemented in each and every report. In a data warehouse and/or cube we can create the calculation in a single place for everyone to use.

This post highlights some of the methods of doing this, each with their respective pros and cons.

Calculated Members in SSAS Cube


SSAS provides a ‘Calculations’ tab in the cube designer which allows you to create new measures using MDX. You can use any combination of existing measures and dimension attributes, along with the plethora of MDX functions available to create highly complex calculations.
Pros:

  • Very complex calculations can be created using all available MDX functions
  • No changes are required to the structure of the data warehouse
  • Changes to the calculation will apply to every record, historic and new
  • The results are not stored in the warehouse or cube, so no extra space is required
  • New calculations can be added without having to deploy or reprocess the cube
  • Calculations can be scoped to any level of aggregation and granularity. Different calculations can even be used for different scopes
  • Calculations can easily combine measures from different measure groups

Cons:

  • The calculation will not make use of SSAS cube aggregations, reducing performance
  • SSAS drill through actions will not work
  • The calculation results are not available in the data warehouse, only the cube

SQL Calculations in the Data Source View


There’s a layer in-between the data warehouse and the cube called the data source view (DSV). This presents the relevant tables in the warehouse to the cube, and can be used to enhance the underlying data with calculations. You can either use the DSV layer within the cube project or, as I prefer, create SQL Server views to encapsulate the logic.
Pros:

  • No changes are required to the table structure of the data warehouse
  • Calculations use SQL not MDX, reducing the complexity
  • Changes to the calculation will apply to every record, historic and new
  • The calculation will make full use of SSAS cube aggregations
  • SSAS drill through actions will work
  • The results are not stored in the warehouse, so the size of the database does not increase

Cons:

  • The cube must be redeployed and reprocessed before the new measure is available
  • The results of the calculation must be valid at the granularity of the fact table
  • The calculation results are not available in the data warehouse, only the cube
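
The granularity point is the one that catches people out, so here is a quick illustration in Python. Profit is the example used earlier in this post. The margin calculation and the figures are my own, added purely to show the contrast. A row level calculation is only safe if summing the row results gives the right answer at every level of aggregation.

```python
import math

# Invented fact rows at fact table granularity.
fact_rows = [
    {"income": 100.0, "expenditure": 60.0},
    {"income": 300.0, "expenditure": 270.0},
]


def profit(row):
    """Additive: summing row level profit gives the correct total."""
    return row["income"] - row["expenditure"]


def margin(row):
    """A ratio: only meaningful for the row it was calculated on."""
    return profit(row) / row["income"]


total_income = sum(row["income"] for row in fact_rows)
total_profit = sum(profit(row) for row in fact_rows)
summed_margin = sum(margin(row) for row in fact_rows)
true_margin = total_profit / total_income

print("Total profit:", total_profit)
print("Sum of row margins:", summed_margin)
print("Margin on totals:", true_margin)
print("Row margins sum correctly:", math.isclose(summed_margin, true_margin))
```

Profit can happily live in the data source view or the ETL. A ratio like margin needs to be calculated after aggregation, which is where an MDX calculated member comes in.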

Calculate in the ETL process


Whilst bringing in data from the source data systems, it sometimes makes sense to perform calculations on the data at that point, and store the results in the warehouse.
Pros:

  • The results of the calculation will be available when querying the warehouse as well as the cube
  • In the ETL pipeline you can import other data sources (using lookups etc.) to utilise other data in the calculation
  • If the calculation uses time based data, or data valid at a specific time (e.g. share price) then by performing the calculation in the ETL, the correct time based data is used, without having to store the full history of the underlying source data
  • The calculation will make full use of SSAS cube aggregations
  • SSAS drill through actions will work

Cons:

  • You have to be able to alter the structure of the data warehouse, which isn’t always an option.
  • The results are stored in the warehouse, increasing the size of the database
  • The results of the calculation must be valid at the granularity of the fact table
  • If the calculation logic changes, all existing records must be updated

In Conclusion

If the calculation is valid for a single record, and it would be of benefit to have access to the results in the warehouse, then perform the calculation in the ETL pipeline and store the results in the warehouse.

If the calculation is valid for a single record, and it would not be of benefit to have the results in the warehouse, then calculate it in the data source view.

If the calculation is too complex for SQL, requiring MDX functions, then create an MDX calculated measure in the cube.

OLAP Cube Documentation in SSRS part 3

This is the 3rd and final post in this series of blog posts, showing how you can use SQL Server Reporting Services (SSRS), DMVs and spatial data to generate real time automated user guide documentation for your Analysis Services (SSAS) OLAP cube.

UPDATE: I presented a 1 hour session at SQLBits 8 covering this work, you can watch the video here.

In this post, I’m going to enhance the measure group report to include a visualisation of the star schema. To do this I’ll be enhancing one of the stored procedures to utilise SQL Server 2008’s new spatial data types combined with SSRS 2008 R2’s new map functionality.

To do this we’ll update the dbo.upCubeDocDimensionsForMeasureGroup stored proc so that it returns a SQL geometry polygon for each row, in the right place around the circumference of the star. There’s a little math in this, but nothing more than a bit of trigonometry.

First the theory. We have an arbitrary number of dimensions that we need to place in a circle around a central point (the measure group). If we have 6 dimensions, then we need to divide the whole circle (360 degrees) by 6 (=60 degrees each) to get the angle of each around the hypothetical axis.

Therefore the first dimension needs to be at 60, the second at 120, the third at 180 etc, with the 6th at 360, completing the full circle.
Obviously the angle needs to vary depending on the number of dimensions in the query, so we need to calculate it within the stored proc. To do this I’m using common table expressions (CTE) to perform further calculations on the basic query.
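
Before getting into the SQL, the angle theory is easy to check outside the database. This is a throwaway Python sketch with a function name of my own choosing, just to confirm the numbers for the six dimension example.

```python
def dimension_angles(dimension_count):
    """Angle in degrees for each dimension, ranked 1 to dimension_count."""
    return [360.0 * rank / dimension_count
            for rank in range(1, dimension_count + 1)]


print(dimension_angles(6))
```

The last dimension always lands on 360 degrees, which is the same position as 0, so the circle closes without any two dimensions overlapping.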

We wrap the original proc query into a CTE and call it BaseData. We also add an extra field called Seq, which uniquely identifies each row. We’ll use this later to enable us to rank the dimensions.

;WITH BaseData AS
(
    SELECT
          mgd.*
        , d.[DESCRIPTION]
        , REPLACE(REPLACE(CAST(mgd.[DIMENSION_UNIQUE_NAME]
              AS VARCHAR(255))
              ,'[',''),']','') AS DimensionCaption
        , REPLACE(REPLACE(CAST(mgd.[MEASUREGROUP_NAME]
              AS VARCHAR(255))
              ,'[',''),']','') AS MeasureGroupCaption
    FROM OPENQUERY(CubeLinkedServer, 'SELECT
                [CATALOG_NAME]
                   +[CUBE_NAME]
                   +[MEASUREGROUP_NAME]
                   +[DIMENSION_UNIQUE_NAME] AS Seq
                , [CATALOG_NAME]
                , [CUBE_NAME]
                , [MEASUREGROUP_NAME]
                , [MEASUREGROUP_CARDINALITY]
                , [DIMENSION_UNIQUE_NAME]
                , [DIMENSION_CARDINALITY]
                , [DIMENSION_IS_VISIBLE]
                , [DIMENSION_IS_FACT_DIMENSION]
                , [DIMENSION_GRANULARITY]
            FROM $SYSTEM.MDSCHEMA_MEASUREGROUP_DIMENSIONS
            WHERE [DIMENSION_IS_VISIBLE]') mgd
        INNER JOIN OPENQUERY(CubeLinkedServer, 'SELECT
                [CATALOG_NAME]
                ,[CUBE_NAME]
                ,[DIMENSION_UNIQUE_NAME]
                ,[DESCRIPTION]
            FROM $SYSTEM.MDSCHEMA_DIMENSIONS
            WHERE [DIMENSION_IS_VISIBLE]') d
                ON  CAST(mgd.[CATALOG_NAME] AS VARCHAR(255))
                  = CAST(d.[CATALOG_NAME] AS VARCHAR(255))
                AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255))
                  = CAST(d.[CUBE_NAME] AS VARCHAR(255))
                AND CAST(mgd.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
                  = CAST(d.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
     WHERE  CAST(mgd.[CATALOG_NAME] AS VARCHAR(255))     = @Catalog
       AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255))         = @Cube
       AND CAST(mgd.[MEASUREGROUP_NAME] AS VARCHAR(255)) = @MeasureGroup
)

We’ll then add a new CTE which calculates the number of records returned by the previous query.

,TotCount AS
(
    SELECT COUNT(*) AS RecCount FROM BaseData
)

Next we cross join TotCount with the base data, so that every row has the extra RecCount field. We then rank each record, providing each with a unique number from 1 to n.

, RecCount AS
(
    SELECT RANK() OVER (ORDER BY CAST(Seq AS VARCHAR(255))) AS RecID
        , RecCount
        , BaseData.*
    FROM
        BaseData CROSS JOIN TotCount
)

Each record now contains its row number, as well as the total number of rows, so it’s easy to calculate its position around the circle (rank/n * 360). Now we have that, calculating the x and y coordinates of each dimension is simply a case of applying Sine and Cosine. Note that the SQL SIN and COS functions expect angles to be provided in radians not degrees, so we have to use the RADIANS function to convert it for us. I’m also multiplying the result by 1000 to scale the numbers up from a range of -1 to +1 to a range of -1000 to +1000, which makes our life easier later on.

, Angles AS
(
    SELECT
        *
        , SIN(RADIANS((CAST(RecID AS FLOAT)
            /CAST(RecCount AS FLOAT))
            * 360)) * 1000 AS x
        , COS(RADIANS((CAST(RecID AS FLOAT)
            /CAST(RecCount AS FLOAT))
            * 360)) * 1000 AS y
    FROM RecCount
)
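
If you want to check the trigonometry without a cube to hand, the same calculation can be sketched in Python. This mirrors the Angles CTE above, with x from sine and y from cosine, both scaled by 1000. The function name is mine.

```python
import math


def star_position(rec_id, rec_count, scale=1000):
    """Return (x, y) for a dimension ranked rec_id out of rec_count."""
    # Same angle as (RecID / RecCount) * 360, converted to radians
    # because math.sin and math.cos, like SQL's SIN and COS, expect radians.
    angle = math.radians(360.0 * rec_id / rec_count)
    return math.sin(angle) * scale, math.cos(angle) * scale


for rec_id in range(1, 7):
    x, y = star_position(rec_id, 6)
    print(rec_id, round(x, 1), round(y, 1))
```

Every point sits 1000 units from the centre, so the dimensions are evenly spaced around a circle with the measure group at the origin.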

We can now use the x and y coordinates to create a point indicating the position of each dimension, using the code below.

geometry::STGeomFromText('POINT('
   + CAST(y AS VARCHAR(20))
   + ' '
   + CAST(x AS VARCHAR(20))
   + ')',4326) AS Posn

This is a good start, but we want a polygon box, not a single point. We can use a similar geometry function to create a polygon around our point.

geometry::STPolyFromText('POLYGON ((' +
   CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
      + CAST(x+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
   CAST((y*@Stretch)-@BoxSize AS VARCHAR(20)) + ' '
      + CAST(x+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
   CAST((y*@Stretch)-@BoxSize AS VARCHAR(20)) + ' '
      + CAST(x-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
   CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
      + CAST(x-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
   CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
      + CAST(x+(@BoxSize/2) AS VARCHAR(20)) + '
   ))',0) AS Box

You’ll notice that I’m multiplying the y axis by a @Stretch variable. This allows us to stretch or squash the resulting star to make it look better in the report. I’m also using a @BoxSize variable, which we can use to change the relative size of the boxes. This is why I like to work on a -1000 to +1000 scale: it means we can have an integer box size of, say, 250 instead of a fraction such as 0.25, which I think is easier to read.
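
The string concatenation in the SQL is a bit hard on the eyes, so here is the same box construction as a Python sketch. It is only meant to show the shape of the WKT text that ends up being passed to STPolyFromText. The function name and defaults are my own, and the number formatting is a rough stand-in for SQL’s float to VARCHAR conversion.

```python
def box_wkt(x, y, box_size=250, stretch=1.4):
    """Build the POLYGON text for a box centred on a dimension's position."""
    centre = y * stretch      # the stretched value goes first in each pair
    half = box_size // 2      # integer division, as with an INT @BoxSize
    corners = [
        (centre + box_size, x + half),
        (centre - box_size, x + half),
        (centre - box_size, x - half),
        (centre + box_size, x - half),
    ]
    corners.append(corners[0])  # a polygon ring must end where it started
    points = ", ".join(f"{a:g} {b:g}" for a, b in corners)
    return "POLYGON ((" + points + "))"


print(box_wkt(0.0, 1000.0))
```

The repeated first point is what closes the ring. Without it the polygon text is not valid.
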
So you’ll now have a stored proc similar to this.

CREATE PROCEDURE [dbo].[upCubeDocDimensionsForMeasureGroup]
    (@Catalog       VARCHAR(255)
    ,@Cube          VARCHAR(255)
    ,@MeasureGroup  VARCHAR(255)
    )
AS

 DECLARE @BoxSize INT
 DECLARE @Stretch FLOAT
 SET @BoxSize = 250
 SET @Stretch = 1.4

;WITH BaseData AS
(
    SELECT
          mgd.*
        , d.[DESCRIPTION]
        , REPLACE(REPLACE(CAST(mgd.[DIMENSION_UNIQUE_NAME]
              AS VARCHAR(255))
              ,'[',''),']','') AS DimensionCaption
        , REPLACE(REPLACE(CAST(mgd.[MEASUREGROUP_NAME]
              AS VARCHAR(255))
              ,'[',''),']','') AS MeasureGroupCaption
    FROM OPENQUERY(CubeLinkedServer, 'SELECT
                [CATALOG_NAME]
                   +[CUBE_NAME]
                   +[MEASUREGROUP_NAME]
                   +[DIMENSION_UNIQUE_NAME] AS Seq
                , [CATALOG_NAME]
                , [CUBE_NAME]
                , [MEASUREGROUP_NAME]
                , [MEASUREGROUP_CARDINALITY]
                , [DIMENSION_UNIQUE_NAME]
                , [DIMENSION_CARDINALITY]
                , [DIMENSION_IS_VISIBLE]
                , [DIMENSION_IS_FACT_DIMENSION]
                , [DIMENSION_GRANULARITY]
            FROM $SYSTEM.MDSCHEMA_MEASUREGROUP_DIMENSIONS
            WHERE [DIMENSION_IS_VISIBLE]') mgd
        INNER JOIN OPENQUERY(CubeLinkedServer, 'SELECT
                [CATALOG_NAME]
                ,[CUBE_NAME]
                ,[DIMENSION_UNIQUE_NAME]
                ,[DESCRIPTION]
            FROM $SYSTEM.MDSCHEMA_DIMENSIONS
            WHERE [DIMENSION_IS_VISIBLE]') d
                ON  CAST(mgd.[CATALOG_NAME] AS VARCHAR(255))
                  = CAST(d.[CATALOG_NAME] AS VARCHAR(255))
                AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255))
                  = CAST(d.[CUBE_NAME] AS VARCHAR(255))
                AND CAST(mgd.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
                  = CAST(d.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
     WHERE  CAST(mgd.[CATALOG_NAME] AS VARCHAR(255))      = @Catalog
        AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255))         = @Cube
        AND CAST(mgd.[MEASUREGROUP_NAME] AS VARCHAR(255)) = @MeasureGroup
)
,TotCount AS
(
    SELECT COUNT(*) AS RecCount FROM BaseData
)
, RecCount AS
(
    SELECT RANK() OVER (ORDER BY CAST(Seq AS VARCHAR(255))) AS RecID
        , RecCount
        , BaseData.*
    FROM
        BaseData CROSS JOIN TotCount
)
, Angles AS
(
    SELECT
        *
        , SIN(RADIANS((CAST(RecID    AS FLOAT)
             /CAST(RecCount AS FLOAT))
             * 360)) * 1000 AS x
        , COS(RADIANS((CAST(RecID AS FLOAT)
             /CAST(RecCount AS FLOAT))
             * 360)) * 1000 AS y
    FROM RecCount
)
,Results AS
(
    SELECT
        *
        , geometry::STGeomFromText('POINT('
            + CAST(y AS VARCHAR(20))
            + ' '
            + CAST(x AS VARCHAR(20))
            + ')',4326) AS Posn
        , geometry::STPolyFromText('POLYGON ((' +
            CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST((y*@Stretch)-@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST((y*@Stretch)-@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x+(@BoxSize/2) AS VARCHAR(20)) + '
            ))',0) AS Box
    FROM Angles
)
SELECT * FROM Results
GO

If you then execute this in Management Studio, you’ll notice an extra tab in the result window called Spatial Results.

EXEC [dbo].[upCubeDocDimensionsForMeasureGroup]
    @Catalog = 'Adventure Works DW 2008R2',
    @Cube = 'Adventure Works',
    @MeasureGroup = 'Financial Reporting'

Click on the Spatial Results tab, then select Box as the spatial column, and you’ll see the boxes that we’ve created in a preview window.

This is now getting somewhere close. But as well as the dimensions, we also want to show the measure group in the middle, as well as lines linking them together to actually create our star. We can do this by adding a couple more geometry functions to our query. We end up with the end of our proc looking like this.

,Results AS
(
    SELECT
        *
        , geometry::STGeomFromText('POINT('
            + CAST(y AS VARCHAR(20))
            + ' '
            + CAST(x AS VARCHAR(20))
            + ')',4326) AS Posn
        , geometry::STPolyFromText('POLYGON ((' +
            CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST((y*@Stretch)-@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST((y*@Stretch)-@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x+(@BoxSize/2) AS VARCHAR(20)) + '
            ))',0) AS Box
         , geometry::STLineFromText('LINESTRING (0 0, '
              + CAST((y*@Stretch) AS VARCHAR(20))
              + ' ' + CAST(x AS VARCHAR(20))
              + ')', 0) AS Line
         , geometry::STPolyFromText('POLYGON ((' +
            CAST(0+@BoxSize AS VARCHAR(20)) + ' '
              + CAST(0+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST(0-@BoxSize AS VARCHAR(20)) + ' '
              + CAST(0+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST(0-@BoxSize AS VARCHAR(20)) + ' '
              + CAST(0-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST(0+@BoxSize AS VARCHAR(20)) + ' '
              + CAST(0-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST(0+@BoxSize AS VARCHAR(20)) + ' '
              + CAST(0+(@BoxSize/2) AS VARCHAR(20)) + '
            ))',0) AS CenterBox

    FROM Angles
)
SELECT * FROM Results
GO

So, we’ve now got the polygons and lines being generated by the proc. It’s now time to add them into the report and display them to our users.

Firstly open up the CubeDoc_MeasureGroup.rdl report, and go to the properties of the dsDimensions dataset. Click Refresh Fields and the new x, y, Posn, Box, Line and CenterBox fields should now be available.
Then drag a Map from the toolbox onto the report. This will start the map wizard.
Select SQL Server Spatial Query as the source for the data and click Next.
Choose dsDimensions as the dataset to use for the data and click Next.
Choose Box as the spatial field, and Polygon as the layer type. It may well give you an error; just ignore it.
Don’t select the embed map or Bing maps layer.

Click Next, then select a basic map, select the default options for the remaining wizard stages and you’ll end up with a map in your report.

If you preview the report at this stage you won’t see the polygons. This is because the map still thinks it’s a geographical map, and it is trying to draw our boxes as latitude and longitudes. We don’t want this, but want it to show them on our own scale. To fix this, just change the CoordinateSystem property of the map from Geographic to Planar.

You can then preview the report, which should show you something like this

It still doesn’t look like a star but we’ve still got a few more changes to make. We need to add a couple more layers to the map, the center box for the measure group and then the lines to link them all together.
Add a new polygon layer to the map and then set the layer data to use the CenterBox field from the dsDimensions dataset.

Repeat the above, but with a new line layer instead of a polygon layer. Set the layer data to use the Line field of dsDimensions.

To move the lines behind the boxes, just rearrange the order of the layers by using the blue arrows in the map layers window. We want the Lines layer to be at the bottom.

Set the PolygonTemplate.BackgroundColor property of the CenterBox layer to Khaki, and then set the same property of the dimension box layer to LightBlue.

Then set the PolygonTemplate.Label property of the CenterBox layer to the MeasureGroupCaption field, and set the ShowLabel property to True. If you don’t, SSRS will decide whether or not to show the label, and we want it to always show.
Set the PolygonTemplate.Label property of the Dimension layer to the DimensionCaption field, and set the ShowLabel property to True.
You can then play around with font sizes, zoom, background colours, line widths etc. to get the effects that you want, but you’ll end up with a star schema visualisation similar to this.

You can also configure actions for each layer. Using this you can hyperlink each dimension box to show the CubeDoc_Dimension report for the selected dimension etc, making the star schema interactive.

This has been quite a fun blog post to investigate. I hope you can take something useful from it and have as much fun with it as I’ve had. Every demo that I’ve seen using spatial data has been using maps, so hopefully this shows an alternative use beyond geographical mapping.

OLAP Cube Documentation in SSRS part 2

In my previous post I described how to create a number of stored procedures that use Dynamic Management Views (DMVs) to return the metadata structure of an SSAS 2008 OLAP cube, including dimensions, attributes, measure groups, BUS matrix etc.

In this post I’m going to use those procs to create a set of SSRS 2008 reports that will serve as the automated documentation of your cube. I’m going to make the following assumptions:

  • You’ve read the part 1 post, and already have the stored procs in a database.
  • You know the basics of SSRS 2008

If you haven’t read part 1, you can jump to it here.

UPDATE: I presented a 1 hour session at SQLBits 8 covering this work, you can watch the video here.

Firstly I create a basic template report which has an appropriate header, with date/time stamp, server name and logos etc. This means that all of the reports will have a common look and feel. I could of course make use of the new report parts in SSRS 2008 R2 for this, but to maintain compatibility with pre R2 I’ll keep it simple.

The expression in the box on the top right is just to display today’s date in UK format dd/mm/yyyy.

  =FORMAT(Today(),"dd/MM/yyyy")

The reports that we’ll build will include the following:

  • CubeDoc_Cubes.rdl – Entry page, list of all cubes in the database
  • CubeDoc_Cube.rdl – showing all measure groups and dimensions within the selected cube
  • CubeDoc_Dimension.rdl – showing all hierarchies, attributes and related measure groups
  • CubeDoc_MeasureGroup.rdl – showing all measures and related dimensions
  • CubeDoc_Search.rdl – search names and descriptions of cubes, dimensions, measure groups, etc

Create the first report (CubeDoc_Cubes.rdl) which will act as the entry screen and menu of cubes.
Add a dataset dsCubes, and point it at the stored proc dbo.upCubeDocCubes

The proc has a @Catalog parameter, which filters the result set to a specific SSAS database (catalog). We want to return all cubes from all catalogs, so set the parameter value to =Nothing

All we have to do now is add a table and pull in the dataset fields that we want to see.

We can then preview the report to test that it returns the right list of cubes. You should see something like this.

Note that the AdventureWorks database doesn’t contain any descriptions, so you won’t see any in the report but they will be there when you add descriptions to your own cubes.

The next report we’re going to write is the CubeDoc_Cube report, which will list the measure groups, dimensions and BUS matrix of a single cube. We’ll link the two reports together later on.

Create a new report, using the template report you created earlier (select the template report in the solution explorer window, then CTRL+C then CTRL+V) and rename the new file as CubeDoc_Cube.rdl.

Add a report parameter called @Catalog, which should be of type Text. I’ve set mine to default to “Adventure Works DW 2008R2” to make testing easier.

Add a dataset called dsCubes, and point it at the dbo.upCubeDocCubes proc, and link the @Catalog dataset parameter to the @Catalog report parameter.

This dataset will query the available cubes for the selected catalog, and populate a new parameter which we’ll now create, called @Cube. This should also be a text parameter, but this time we’ll set the available values to those returned by the dsCubes dataset.

If you want you can also set the default value of the parameter to the CUBE_NAME field of dsCubes. This parameter is not a multi value parameter, so by defaulting it to the dataset it will just default to the first record.

We can now use @Catalog and @Cube parameters to query the available measure groups and dimensions.
So, create three new datasets:

  • dsDimensions – pointing to dbo.upCubeDocDimensionsInCube
  • dsMeasureGroups – pointing to dbo.upCubeDocMeasureGroupsInCube
  • dsBusMatrix – pointing to dbo.upCubeDocBUSMatrix

Set each of their @Catalog parameters to the report’s @Catalog parameter, and their @Cube parameters to the report’s @Cube parameter.

Create two tables in the report, one for measure groups and one for dimensions. Drag in the fields that you want to see, and preview the report. You should see something like this.

I’ve added a couple of textbox titles for good measure.

The third dataset, dsBusMatrix, requires something a little more interesting. For those who aren’t familiar with Kimball’s BUS Matrix concept, it’s a grid that shows the relationship and connectivity between facts (measure groups) and their dimensions. As the name suggests, we’ll use SSRS’s Matrix control for this. Once you’ve added a matrix control onto the report, follow these steps (using the dsBusMatrix dataset):

  • Drag DIMENSION_UNIQUE_NAME onto the Rows of the matrix
  • Drag MEASUREGROUP_NAME onto the columns of the matrix
  • Right click on the Data box of the matrix, and type in the expression below. This checks the cardinality of the dimensions/measures to determine whether it is a regular relationship, a many to many or a degenerate fact
=SWITCH(SUM(Fields!Relationship.Value)=0,"",
   Fields!DIMENSION_IS_FACT_DIMENSION.Value=1,"F",
   Fields!MEASUREGROUP_CARDINALITY.Value="MANY","M",
   True,"X")

This will either show an X if there is a regular relationship, or show an M or F if there’s a many to many or degenerate relationship respectively.
To make it easier to read, I also like to set the background colour of the Data textbox to highlight the type of relationship further.

=SWITCH(SUM(Fields!Relationship.Value)=0,"Transparent",
   Fields!DIMENSION_IS_FACT_DIMENSION.Value=1,"Yellow",
   Fields!MEASUREGROUP_CARDINALITY.Value="MANY","Red",
   True,"CornflowerBlue")
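To make the order of the checks in the two SWITCH expressions easier to follow, here is a minimal sketch of the same decision logic in Python. It is purely illustrative and not part of the report; the function name bus_matrix_cell is my own invention, and the inputs stand in for the summed Relationship value and the two fields used above.

```python
def bus_matrix_cell(relationship_sum, is_fact_dimension, measuregroup_cardinality):
    """Return (symbol, colour) for one matrix cell.

    Mirrors the two SWITCH expressions: the first matching test wins,
    so the order of the checks matters.
    """
    if not relationship_sum:
        # No row exists for this dimension/measure group pair
        return "", "Transparent"
    if is_fact_dimension == 1:
        # Degenerate (fact) dimension
        return "F", "Yellow"
    if measuregroup_cardinality == "MANY":
        # Many to many relationship
        return "M", "Red"
    # Anything else is treated as a regular relationship
    return "X", "CornflowerBlue"


print(bus_matrix_cell(1, 0, "MANY"))
```

Because the fact dimension test comes before the cardinality test, a pair that satisfies both would be shown as F rather than M.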

If you preview the report you should see the following

It shows the concept, but it needs a little tidying up. Centering the text in the data textbox helps, but we can also use a fantastic new feature in SSRS 2008 R2 to rotate the column titles. Simply set the WritingMode property in the Localization group to Rotate270, and then shrink the width of the column.

I’ve also added a title, with a key, and a level of row grouping using the DIMENSION_MASTER_NAME field, which groups role playing dimensions by their master dimension. It should now look something like this.

That’s it for this report, so save it, then go back to the first report (CubeDoc_Cubes.rdl) and right click, properties on the [CUBE_NAME] textbox. Go to the action tab, and set the action to navigate to the CubeDoc_Cube report, passing through the CATALOG_NAME and CUBE_NAME fields from the dataset as the parameter values. This sets up a hyperlink from one report to the other, allowing users to navigate around the cube doc reports by clicking on what they want to know about.

We then need to do the same for 3 other reports:
CubeDoc_Dimensions

  • dbo.upCubeDocAttributesInDimension results in a table
  • dbo.upCubeDocMeasureGroupsForDimension results in a table
  • Add an extra @Dimension parameter, populated from dbo.upCubeDocDimensionsInCube

CubeDoc_MeasureGroup

  • dbo.upCubeDocMeasuresInMeasureGroup results in a table
  • dbo.upCubeDocDimensionsForMeasureGroup results in a table
  • Add an extra @MeasureGroup parameter, populated from dbo.upCubeDocMeasureGroupsInCube

CubeDoc_Search

  • Add an extra @Search parameter, text, with no default
  • Table containing results from dbo.upCubeDocSearch, using the @Search parameter

Link all appropriate textboxes (measure groups, dimensions, search etc.) to their relevant report using the report action, and hey presto – a fully automated, real time, self-documenting cube report.

In the next and final installment of this series of blog posts, we’ll explore SQL 2008’s spatial data to generate an automated star schema visualisation to add that little something extra to the reports.

OLAP Cube Documentation in SSRS part 1

Being a business intelligence consultant, I like to spend my time designing data warehouses, ETL scripts and OLAP cubes. An unfortunate consequence of this is having to write the documentation that goes with the fun techy work. So it got me thinking, is there a slightly more fun techy way of automating the documentation of OLAP cubes…

There are some good tools out there such as BI Documenter, but I wanted a way of having more control over the output, and also automating it further so that you don’t have to run an overnight build of the documentation.

I found a great article by Vincent Rainardi describing some DMVs (Dynamic Management Views) available in SQL 2008, which got me thinking: why not just build a number of SSRS reports calling these DMVs, which would then dynamically create the cube structure documentation in real time whenever the report rendered?

This post is the first in a 3 part set which will demonstrate how you can use these DMVs to automate the SSAS cube documentation and user guide.

UPDATE: I presented a 1 hour session at SQLBits 8 covering all of this work, you can watch the video here.

There’s a full list of DMVs available in SQL 2008 R2 on the msdn site.

The primary DMVs that are of interest are:

DMV                                Description
MDSCHEMA_CUBES                     Lists the cubes in an SSAS database
MDSCHEMA_MEASUREGROUPS             Lists measure groups
MDSCHEMA_DIMENSIONS                Lists dimensions
MDSCHEMA_LEVELS                    Dimension attributes
MDSCHEMA_MEASUREGROUP_DIMENSIONS   Enumerates dimensions of measure groups
MDSCHEMA_MEASURES                  Lists measures

When querying DMVs we can use SQL style SELECT statements, but executed against the cube in a DMX window.

SELECT *
FROM $SYSTEM.MDSCHEMA_CUBES

This returns a dataset like any other SQL query.

We can even enhance it with DISTINCT and WHERE clauses, although they are more restricted than basic SQL. One of the main limitations is the lack of a JOIN operator. A number of the queries that I’ll perform below need to use JOIN, so to get around this I wrap up each query in a SQL OPENQUERY command, executed against a SQL database with a linked server to the cube. This enables me to perform JOINs using queries such as

SELECT *
FROM OPENQUERY(CubeLinkedServer,
   'SELECT *
    FROM $SYSTEM.MDSCHEMA_MEASUREGROUP_DIMENSIONS') mgd
INNER JOIN OPENQUERY(CubeLinkedServer,
   'SELECT *
    FROM $SYSTEM.MDSCHEMA_MEASUREGROUPS') mg
ON mgd.XXX = mg.XXX


etc.
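One thing to watch with this pattern is that the DMV query is passed to OPENQUERY as a T-SQL string literal, so any single quotes inside it have to be doubled. As a rough sketch only, here is a small Python helper that builds the wrapped text. The helper name openquery is hypothetical, Python is not used anywhere in the original solution, and this is only meant for static query text that you control, not for user input.

```python
def openquery(linked_server, dmv_query, alias=None):
    """Wrap a DMV query in a T-SQL OPENQUERY call and return the text.

    Single quotes inside the DMV query are doubled, because the query
    is embedded in a T-SQL string literal.
    """
    escaped = dmv_query.replace("'", "''")
    sql = f"OPENQUERY({linked_server}, '{escaped}')"
    # An alias lets the result be referenced in a JOIN
    return f"{sql} {alias}" if alias else sql


print("SELECT * FROM "
      + openquery("CubeLinkedServer",
                  "SELECT * FROM $SYSTEM.MDSCHEMA_MEASUREGROUPS", "mg"))
```

The generated text can then be pasted into a query window or a stored proc, just like the hand-written version above.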

I’m therefore going to create a number of stored procs to wrap up this functionality, so that the SSRS reports can then just call the procs.

Within BIDS, every item (cube, measure group, measure, dimension, attribute, hierarchy, KPI, etc.) has a description in the properties pane which is a multi-line free text property. These are exposed by the DMVs, so I’m going to make use of them and bring them out in the reports. This allows you to create the descriptions within BIDS as you’re developing the cube, meaning they’re version controlled and always in sync with the code.

I should also point out that I’m using SQL Server 2008 R2. All of the queries below will work with SQL 2008, but I want to use the spatial report functionality of SSRS 2008 R2 to generate dynamic star schema visualisations, which is only supported in R2.

In this post I’ll script out the stored procedures used as the basis of the documentation. In my next post I’ll put these into SSRS reports.

Let’s get started.

Firstly we need to create our linked server. This script will create a linked server called CubeLinkedServer pointing to the Adventure Works DW 2008R2 OLAP database on the local server.

EXEC master.dbo.sp_addlinkedserver
   @server = N'CubeLinkedServer',
   @srvproduct=N'MSOLAP',
   @provider=N'MSOLAP',
   @datasrc=N'(local)',
   @catalog=N'Adventure Works DW 2008R2'


You’ll have to set up the security according to your requirements. So now let’s start creating the source procs.

The first proc lists all of the cubes. The MDSCHEMA_CUBES DMV returns not only cubes but also dimensions, so I’m filtering it to only return cubes by specifying CUBE_SOURCE=1.

CREATE PROCEDURE [dbo].[upCubeDocCubes]
  (@Catalog       VARCHAR(255) = NULL
  )
AS
  SELECT *
  FROM OPENQUERY(CubeLinkedServer,
    'SELECT *
     FROM $SYSTEM.MDSCHEMA_CUBES
     WHERE CUBE_SOURCE = 1')
  WHERE CAST([CATALOG_NAME] AS VARCHAR(255)) = @Catalog
    OR @Catalog IS NULL
GO
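The OR @Catalog IS NULL clause is what makes the parameter optional: pass a catalog name to filter, or leave it as NULL to get every cube. A minimal Python sketch of the same idea is below. The function name and the sample rows are made up for illustration, and the second catalog name is purely hypothetical.

```python
def filter_by_catalog(rows, catalog=None):
    """Keep rows for one catalog, or every row when no catalog is given."""
    return [r for r in rows
            if catalog is None or r["CATALOG_NAME"] == catalog]


# Hypothetical rows standing in for the MDSCHEMA_CUBES output
cubes = [
    {"CATALOG_NAME": "Adventure Works DW 2008R2", "CUBE_NAME": "Adventure Works"},
    {"CATALOG_NAME": "Some Other Catalog", "CUBE_NAME": "Some Other Cube"},
]

print(len(filter_by_catalog(cubes)))
print(len(filter_by_catalog(cubes, "Adventure Works DW 2008R2")))
```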


The next proc returns all measure groups found within a specified cube.

CREATE PROCEDURE [dbo].[upCubeDocMeasureGroupsInCube]
  (@Catalog       VARCHAR(255)
  ,@Cube          VARCHAR(255)
  )
AS
  SELECT *
  FROM OPENQUERY(CubeLinkedServer,
    'SELECT *
     FROM $SYSTEM.MDSCHEMA_MEASUREGROUPS ')
  WHERE CAST([CATALOG_NAME] AS VARCHAR(255)) = @Catalog
    AND CAST([CUBE_NAME] AS VARCHAR(255))    = @Cube
GO


This next proc returns a list of measures within a specified measure group.

CREATE PROCEDURE [dbo].[upCubeDocMeasuresInMeasureGroup]
  (@Catalog       VARCHAR(255)
  ,@Cube          VARCHAR(255)
  ,@MeasureGroup  VARCHAR(255)
  )
AS
SELECT * FROM OPENQUERY(CubeLinkedServer,
  'SELECT *
   FROM $SYSTEM.MDSCHEMA_MEASURES
     WHERE [MEASURE_IS_VISIBLE]')
   WHERE CAST([CATALOG_NAME] AS VARCHAR(255))      = @Catalog
     AND CAST([CUBE_NAME] AS VARCHAR(255))         = @Cube
     AND CAST([MEASUREGROUP_NAME] AS VARCHAR(255)) = @MeasureGroup
GO


The following proc queries all dimensions available within a specified cube. I’m filtering using the DIMENSION_IS_VISIBLE column to only show visible dimensions.

CREATE PROCEDURE [dbo].[upCubeDocDimensionsInCube]
  (@Catalog       VARCHAR(255)
  ,@Cube          VARCHAR(255)
  )
AS
SELECT * FROM OPENQUERY(CubeLinkedServer,
  'SELECT *
   FROM $SYSTEM.MDSCHEMA_DIMENSIONS
     WHERE [DIMENSION_IS_VISIBLE]')
   WHERE CAST([CATALOG_NAME] AS VARCHAR(255)) = @Catalog
     AND CAST([CUBE_NAME] AS VARCHAR(255))    = @Cube
GO


Then we can query all available attributes within a dimension. This DMV returns a bitmask field (LEVEL_ORIGIN) which defines whether the attribute is a key, attribute or hierarchy. I’m using bitwise AND (&) to split this into three separate fields for ease of use. I’m also filtering out invisible attributes, as well as those with a level of 0. Level 0 is the [All] member of any attribute, which we can ignore for this purpose.

CREATE PROCEDURE [dbo].[upCubeDocAttributesInDimension]
  (@Catalog       VARCHAR(255)
  ,@Cube          VARCHAR(255)
  ,@Dimension  VARCHAR(255)
  )
AS
  SELECT *
    , CASE WHEN CAST([LEVEL_ORIGIN] AS INT) & 1 = 1
        THEN 1 ELSE 0 END AS IsHierarchy
    , CASE WHEN CAST([LEVEL_ORIGIN] AS INT) & 2 = 2
        THEN 1 ELSE 0 END AS IsAttribute
    , CASE WHEN CAST([LEVEL_ORIGIN] AS INT) & 4 = 4
        THEN 1 ELSE 0 END AS IsKey
  FROM OPENQUERY(CubeLinkedServer,
    'SELECT *
     FROM $SYSTEM.MDSCHEMA_LEVELS
     WHERE [LEVEL_NUMBER]>0
       AND [LEVEL_IS_VISIBLE]')
  WHERE CAST([CATALOG_NAME] AS VARCHAR(255))          = @Catalog
    AND CAST([CUBE_NAME] AS VARCHAR(255))             = @Cube
    AND CAST([DIMENSION_UNIQUE_NAME] AS VARCHAR(255)) = @Dimension
GO
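If the bitwise AND is unfamiliar, this short Python sketch decodes a LEVEL_ORIGIN value in the same way as the three CASE expressions in the proc. The function name is hypothetical and the bit values (1, 2 and 4) are simply those used in the proc above.

```python
def decode_level_origin(level_origin):
    """Split the LEVEL_ORIGIN bitmask into three 0/1 flags."""
    return {
        "IsHierarchy": 1 if level_origin & 1 else 0,  # bit 1
        "IsAttribute": 1 if level_origin & 2 else 0,  # bit 2
        "IsKey":       1 if level_origin & 4 else 0,  # bit 4
    }


# 6 = 2 + 4, so both the attribute and key bits are set
print(decode_level_origin(6))
```

Because each flag tests a single bit, one level can have more than one flag set at the same time.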


The next proc returns measure groups with their associated dimensions. We have to join two DMVs together in order to get the description columns of both the dimension and measure group.

CREATE PROCEDURE [dbo].[upCubeDocMeasureGroupsForDimension]
    (@Catalog       VARCHAR(255)
    ,@Cube          VARCHAR(255)
    ,@Dimension     VARCHAR(255)
    )
AS
  SELECT
    mgd.*
    , mg.[DESCRIPTION]
  FROM OPENQUERY(CubeLinkedServer,
    'SELECT
       [CATALOG_NAME]
       , [CUBE_NAME]
       , [MEASUREGROUP_NAME]
       , [MEASUREGROUP_CARDINALITY]
       , [DIMENSION_UNIQUE_NAME]
     FROM $SYSTEM.MDSCHEMA_MEASUREGROUP_DIMENSIONS
       WHERE [DIMENSION_IS_VISIBLE]') mgd
   INNER JOIN OPENQUERY(CubeLinkedServer,
     'SELECT
       [CATALOG_NAME]
       ,[CUBE_NAME]
       ,[MEASUREGROUP_NAME]
       ,[DESCRIPTION]
     FROM $SYSTEM.MDSCHEMA_MEASUREGROUPS') mg
        ON  CAST(mgd.[CATALOG_NAME] AS VARCHAR(255))
           = CAST(mg.[CATALOG_NAME] AS VARCHAR(255))
        AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255))
           = CAST(mg.[CUBE_NAME] AS VARCHAR(255))
        AND CAST(mgd.[MEASUREGROUP_NAME] AS VARCHAR(255))
           = CAST(mg.[MEASUREGROUP_NAME] AS VARCHAR(255))
  WHERE CAST(mgd.[CATALOG_NAME] AS VARCHAR(255))            = @Catalog
    AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255))               = @Cube
    AND CAST(mgd.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))   = @Dimension
GO


The next proc is similar to the above, but the opposite way around. It returns all dimensions that are related to a measure group.

CREATE PROCEDURE [dbo].[upCubeDocDimensionsForMeasureGroup]
  (@Catalog       VARCHAR(255)
  ,@Cube          VARCHAR(255)
  ,@MeasureGroup  VARCHAR(255)
  )
AS
  SELECT
    mgd.*
    , d.[DESCRIPTION]
  FROM OPENQUERY(CubeLinkedServer,
    'SELECT
        [CATALOG_NAME]
       ,[CUBE_NAME]
       ,[MEASUREGROUP_NAME]
       ,[MEASUREGROUP_CARDINALITY]
       ,[DIMENSION_UNIQUE_NAME]
       ,[DIMENSION_CARDINALITY]
       ,[DIMENSION_IS_VISIBLE]
       ,[DIMENSION_IS_FACT_DIMENSION]
       ,[DIMENSION_GRANULARITY]
     FROM $SYSTEM.MDSCHEMA_MEASUREGROUP_DIMENSIONS
       WHERE [DIMENSION_IS_VISIBLE]') mgd
  INNER JOIN OPENQUERY(CubeLinkedServer,
    'SELECT
       [CATALOG_NAME]
       ,[CUBE_NAME]
       ,[DIMENSION_UNIQUE_NAME]
       ,[DESCRIPTION]
     FROM $SYSTEM.MDSCHEMA_DIMENSIONS
       WHERE [DIMENSION_IS_VISIBLE]') d
   ON  CAST(mgd.[CATALOG_NAME] AS VARCHAR(255))
       = CAST(d.[CATALOG_NAME] AS VARCHAR(255))
   AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255))
       = CAST(d.[CUBE_NAME] AS VARCHAR(255))
   AND CAST(mgd.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
       = CAST(d.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
  WHERE  CAST(mgd.[CATALOG_NAME] AS VARCHAR(255))        = @Catalog
     AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255))           = @Cube
     AND CAST(mgd.[MEASUREGROUP_NAME] AS VARCHAR(255))   = @MeasureGroup
GO


The next proc builds a BUS matrix, joining every dimension to its related measure groups. Later we’ll use the SSRS tablix control to pivot this into matrix form.

CREATE PROCEDURE [dbo].[upCubeDocBUSMatrix]
    (@Catalog       VARCHAR(255),
     @Cube          VARCHAR(255)
    )
AS
  SELECT
     bus.[CATALOG_NAME]
    ,bus.[CUBE_NAME]
    ,bus.[MEASUREGROUP_NAME]
    ,bus.[MEASUREGROUP_CARDINALITY]
    ,bus.[DIMENSION_UNIQUE_NAME]
    ,bus.[DIMENSION_CARDINALITY]
    ,bus.[DIMENSION_IS_FACT_DIMENSION]
    ,bus.[DIMENSION_GRANULARITY]
    ,dim.[DIMENSION_MASTER_NAME]
    ,1 AS Relationship
  FROM
    OPENQUERY(CubeLinkedServer,
      'SELECT
        [CATALOG_NAME]
        ,[CUBE_NAME]
        ,[MEASUREGROUP_NAME]
        ,[MEASUREGROUP_CARDINALITY]
        ,[DIMENSION_UNIQUE_NAME]
        ,[DIMENSION_CARDINALITY]
        ,[DIMENSION_IS_FACT_DIMENSION]
        ,[DIMENSION_GRANULARITY]
       FROM $SYSTEM.MDSCHEMA_MEASUREGROUP_DIMENSIONS
        WHERE [DIMENSION_IS_VISIBLE]') bus
    INNER JOIN OPENQUERY(CubeLinkedServer,
      'SELECT
        [CATALOG_NAME]
        ,[CUBE_NAME]
        ,[DIMENSION_UNIQUE_NAME]
        ,[DIMENSION_MASTER_NAME]
       FROM $SYSTEM.MDSCHEMA_DIMENSIONS') dim
    ON CAST(bus.[CATALOG_NAME] AS VARCHAR(255))
     = CAST(dim.[CATALOG_NAME] AS VARCHAR(255))
    AND CAST(bus.[CUBE_NAME] AS VARCHAR(255))
     = CAST(dim.[CUBE_NAME] AS VARCHAR(255))
    AND CAST(bus.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
     = CAST(dim.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
  WHERE  CAST(bus.[CATALOG_NAME] AS VARCHAR(255)) = @Catalog
     AND CAST(bus.[CUBE_NAME] AS VARCHAR(255)) = @Cube
GO
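The proc returns one row per dimension and measure group pair, each with Relationship set to 1, and leaves the pivoting to the report. To show what that pivot does, here is a rough Python sketch that sums Relationship into a grid, with 0 wherever no row exists. The function name and the three sample rows are hypothetical.

```python
from collections import defaultdict


def pivot_bus_matrix(rows):
    """Pivot (dimension, measure group) rows into a nested dict grid."""
    dims = sorted({r["DIMENSION_UNIQUE_NAME"] for r in rows})
    groups = sorted({r["MEASUREGROUP_NAME"] for r in rows})

    # Sum the Relationship column for each pair that has a row
    totals = defaultdict(int)
    for r in rows:
        key = (r["DIMENSION_UNIQUE_NAME"], r["MEASUREGROUP_NAME"])
        totals[key] += r["Relationship"]

    # Pairs without a row get 0, which the report shows as a blank cell
    return {d: {g: totals.get((d, g), 0) for g in groups} for d in dims}


# Hypothetical sample rows
rows = [
    {"DIMENSION_UNIQUE_NAME": "[Date]", "MEASUREGROUP_NAME": "Internet Sales", "Relationship": 1},
    {"DIMENSION_UNIQUE_NAME": "[Date]", "MEASUREGROUP_NAME": "Reseller Sales", "Relationship": 1},
    {"DIMENSION_UNIQUE_NAME": "[Customer]", "MEASUREGROUP_NAME": "Internet Sales", "Relationship": 1},
]

for dim, cells in pivot_bus_matrix(rows).items():
    print(dim, cells)
```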


Next, in order to make it easier for users to find items within the cube, I’ve created a searching proc which will scour a number of the DMVs for anything containing the search term.

CREATE PROCEDURE [dbo].[upCubeDocSearch]
    (@Search        VARCHAR(255)
    ,@Catalog       VARCHAR(255)=NULL
    ,@Cube          VARCHAR(255)=NULL
    )
AS
  WITH MetaData AS
  (
   --Cubes
    SELECT CAST('Cube' AS VARCHAR(20))            AS [Type]
      , CAST(CATALOG_NAME AS VARCHAR(255))     AS [Catalog]
      , CAST(CUBE_NAME AS VARCHAR(255))           AS [Cube]
      , CAST(CUBE_NAME AS VARCHAR(255))           AS [Name]
      , CAST(DESCRIPTION AS VARCHAR(4000)) AS [Description]
      , CAST(CUBE_NAME AS VARCHAR(255))           AS [Link]
    FROM OPENQUERY(CubeLinkedServer,
      'SELECT [CATALOG_NAME], [CUBE_NAME], [DESCRIPTION]
       FROM $SYSTEM.MDSCHEMA_CUBES
       WHERE CUBE_SOURCE = 1')
    WHERE  (CAST([CATALOG_NAME] AS VARCHAR(255))
       = @Catalog OR @Catalog IS NULL)

    UNION ALL

   --Dimensions
    SELECT CAST('Dimension' AS VARCHAR(20))         AS [Type]
      , CAST(CATALOG_NAME AS VARCHAR(255))       AS [Catalog]
      , CAST(CUBE_NAME AS VARCHAR(255))             AS [Cube]
      , CAST(DIMENSION_NAME AS VARCHAR(255))        AS [Name]
      , CAST(DESCRIPTION AS VARCHAR(4000))   AS [Description]
      , CAST(DIMENSION_UNIQUE_NAME AS VARCHAR(255)) AS [Link]
    FROM OPENQUERY(CubeLinkedServer,
      'SELECT [CATALOG_NAME], [CUBE_NAME]
          , [DIMENSION_NAME], [DESCRIPTION]
          , [DIMENSION_UNIQUE_NAME]
       FROM $SYSTEM.MDSCHEMA_DIMENSIONS
         WHERE [DIMENSION_IS_VISIBLE]')
    WHERE  (CAST([CATALOG_NAME] AS VARCHAR(255))
        = @Catalog OR @Catalog IS NULL)
      AND (CAST([CUBE_NAME] AS VARCHAR(255))
        = @Cube OR @Cube IS NULL)
      AND LEFT(CAST(CUBE_NAME AS VARCHAR(255)),1)
        <>'$' --Filter out dimensions not in a cube

    UNION ALL

   --Attributes
    SELECT CAST('Attribute' AS VARCHAR(20))         AS [Type]
      , CAST(CATALOG_NAME AS VARCHAR(255))       AS [Catalog]
      , CAST(CUBE_NAME AS VARCHAR(255))             AS [Cube]
      , CAST(LEVEL_CAPTION AS VARCHAR(255))         AS [Name]
      , CAST(DESCRIPTION AS VARCHAR(4000))   AS [Description]
      , CAST(DIMENSION_UNIQUE_NAME AS VARCHAR(255)) AS [Link]
    FROM OPENQUERY(CubeLinkedServer,
      'SELECT [CATALOG_NAME], [CUBE_NAME]
         , [LEVEL_CAPTION], [DESCRIPTION]
         , [DIMENSION_UNIQUE_NAME]
       FROM $SYSTEM.MDSCHEMA_LEVELS
       WHERE [LEVEL_NUMBER]>0
         AND [LEVEL_IS_VISIBLE]')
    WHERE  (CAST([CATALOG_NAME] AS VARCHAR(255))
         = @Catalog OR @Catalog IS NULL)
      AND (CAST([CUBE_NAME] AS VARCHAR(255))
         = @Cube OR @Cube IS NULL)
      AND LEFT(CAST(CUBE_NAME AS VARCHAR(255)),1)
         <>'$' --Filter out dimensions not in a cube

    UNION ALL

   --Measure Groups
    SELECT CAST('Measure Group' AS VARCHAR(20))   AS [Type]
      , CAST(CATALOG_NAME AS VARCHAR(255))     AS [Catalog]
      , CAST(CUBE_NAME AS VARCHAR(255))           AS [Cube]
      , CAST(MEASUREGROUP_NAME AS VARCHAR(255))   AS [Name]
      , CAST(DESCRIPTION AS VARCHAR(4000)) AS [Description]
      , CAST(MEASUREGROUP_NAME AS VARCHAR(255))   AS [Link]
    FROM OPENQUERY(CubeLinkedServer,
       'SELECT [CATALOG_NAME], [CUBE_NAME]
          , [MEASUREGROUP_NAME]
          , [DESCRIPTION]
        FROM $SYSTEM.MDSCHEMA_MEASUREGROUPS')
    WHERE  (CAST([CATALOG_NAME] AS VARCHAR(255))
       = @Catalog OR @Catalog IS NULL)
     AND (CAST([CUBE_NAME] AS VARCHAR(255))
       = @Cube OR @Cube IS NULL)
     AND LEFT(CAST(CUBE_NAME AS VARCHAR(255)),1)
       <>'$' --Filter out dimensions not in a cube

    UNION ALL

   --Measures
    SELECT CAST('Measure' AS VARCHAR(20))         AS [Type]
      , CAST(CATALOG_NAME AS VARCHAR(255))     AS [Catalog]
      , CAST(CUBE_NAME AS VARCHAR(255))           AS [Cube]
      , CAST(MEASURE_NAME AS VARCHAR(255))        AS [Name]
      , CAST(DESCRIPTION AS VARCHAR(4000)) AS [Description]
      , CAST(MEASUREGROUP_NAME AS VARCHAR(255))   AS [Link]
    FROM OPENQUERY(CubeLinkedServer,
      'SELECT [CATALOG_NAME], [CUBE_NAME]
         , [MEASURE_NAME], [DESCRIPTION]
         , [MEASUREGROUP_NAME]
       FROM $SYSTEM.MDSCHEMA_MEASURES
          WHERE [MEASURE_IS_VISIBLE]')
    WHERE  (CAST([CATALOG_NAME] AS VARCHAR(255))
          = @Catalog OR @Catalog IS NULL)
      AND (CAST([CUBE_NAME] AS VARCHAR(255))
          = @Cube OR @Cube IS NULL)
      AND LEFT(CAST(CUBE_NAME AS VARCHAR(255)),1)
          <>'$' --Filter out dimensions not in a cube

    )
    SELECT *
    FROM MetaData
    WHERE @Search<>''
        AND ([Name] LIKE '%' + @Search + '%'
          OR [Description] LIKE '%' + @Search + '%'
        )
GO
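The final SELECT in the proc is just a "contains" match against the name and description of every item, and it returns nothing when the search term is empty. A minimal Python sketch of that filter follows. It assumes a case-insensitive comparison (in SQL Server this depends on the collation), and the function name and sample items are made up for illustration.

```python
def search_metadata(items, search):
    """Return items whose Name or Description contains the search term."""
    if not search:
        # Mirrors WHERE @Search <> '': an empty or missing term finds nothing
        return []
    needle = search.lower()
    return [
        i for i in items
        if needle in i["Name"].lower()
        or needle in (i["Description"] or "").lower()
    ]


# Hypothetical sample metadata
items = [
    {"Type": "Measure", "Name": "Internet Sales Amount", "Description": None},
    {"Type": "Dimension", "Name": "Customer", "Description": "People who buy via internet"},
    {"Type": "Dimension", "Name": "Date", "Description": "Calendar dates"},
]

print([i["Name"] for i in search_metadata(items, "internet")])
```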


We can now use these procs to form the basis of a number of SSRS reports which will dynamically query the DMVs to generate the SSAS cube documentation. I’ll be covering this stage in my next post.

 

Calculate Run Rate (Full Year Projection) in MDX

This post explains how to create an MDX calculated member that will take a value from the cube and project it forward to the end of the year. This provides a simple mechanism for calculating what your expected total will be at year end, based upon current performance.

To do this more accurately you should use time series data mining models in SSAS and use DMX expressions to query the results, but this method is very simple and requires little effort, and will be pretty accurate so long as the data you’re modelling is fairly linear. Please note though that the more cyclical and seasonal your data is the less effective this will be.

The basic idea is that we take what we have done so far (i.e. year to date sales), look at how far through the year we are, and extrapolate the value of future months (or days/weeks/etc.) based upon values so far.

i.e. If we’re at March month end and we’ve sold 100 widgets so far this year, we’re 1/4 of the way through the year so we multiply 100 by 4 and get a projected yearly total of 400.
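The arithmetic of that example can be written as a tiny Python sketch. The function and argument names are my own and are only there to illustrate the formula, which is the year to date value multiplied by total periods over periods elapsed.

```python
def full_year_run_rate(ytd_value, current_period, total_periods):
    """Project a year to date value forward to the end of the year."""
    return ytd_value * (total_periods / current_period)


# 100 widgets sold by the end of month 3 of 12
print(full_year_run_rate(100, 3, 12))  # 400.0
```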


This chart shows the concept of what we’re doing, and shows the full year projections calculated in March (with 3 months of available data) and June (6 months of data). The projections obviously get more accurate the further you are through the year.

One of the points to note is that when creating a calculation like this, based upon a time dimension, the calculation should always work with any level of the dimension hierarchy selected. i.e. The user shouldn’t care whether they’re looking at a month, week, quarter or a day, the calculation should always work the same. To achieve this we simply use the .currentmember of the time hierarchy.

The following examples are based upon projecting the Internet Sales Amount measure found within the SQL Server 2008 Adventure Works DW sample cube.

Step 1 – What are our total sales so far this year?

MDX helpfully provides us with the YTD function which takes care of this for us.


  MEMBER [Measures].[YTD Sales] AS
    AGGREGATE(
      YTD([Date].[Calendar].CurrentMember)
      ,[Measures].[Internet Sales Amount])

This takes the current member of the Calendar hierarchy, and creates a set of all dates in this year up to and including it using YTD. It then aggregates (in this case sums) the Internet Sales Amount for all of these dates to calculate YTD Sales.

Step 2 – Which period are we in?

Here we’ll use the same YTD function to create a set of all dates so far this year, but in this case we’ll count the number of resulting members. Note that because we’re using the .CurrentMember of the hierarchy, it doesn’t matter if we’re looking at a date, week or month, the MDX will work. i.e. If we’re looking at 21 Jan it will return 21. If we’re looking at Q3 it will return 3, August will return 8 etc.


  MEMBER [Measures].[CurPeriod] AS
    COUNT(
      YTD([Date].[Calendar].CurrentMember)
      ,INCLUDEEMPTY)

Step 3 – How many periods are in the year?

If we coded this to only work with months then we could hard code this to 12; however, we need to keep it generic to all levels of the hierarchy. So, we have to count all the cousins of the current time member [within this year]. Unfortunately there isn’t a Cousins function in MDX, and Siblings will only return other members within the same parent. i.e. siblings of May 4th would include May 1 through to May 31. To get around this we find the year of the current member by using the Ancestor function.


  ANCESTOR([Date].[Calendar].CurrentMember
  , [Date].[Calendar].[Calendar Year])

Then we find all of the descendants of the year, at the same level of the hierarchy (week/day/etc.) as the current member. We can then take a count as before.


  MEMBER [Measures].[TotalPeriods] AS
    COUNT(
      DESCENDANTS(
        ANCESTOR([Date].[Calendar].CurrentMember
          ,[Date].[Calendar].[Calendar Year])
        ,[Date].[Calendar].CurrentMember.level)
      ,INCLUDEEMPTY)

Step 4 – Calculate the Run Rate

Calculating the projected yearly total (run rate) is then a simple calculation.


  MEMBER [Measures].[Full Year Run Rate] AS
    [Measures].[YTD Sales]
    * ([Measures].[TotalPeriods]
       /[Measures].[CurPeriod])
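To see how the same calculation behaves at the day level, here is a rough Python sketch that works out the equivalents of CurPeriod and TotalPeriods for a single date. It assumes the date dimension holds every calendar day of the year, and the function name is hypothetical.

```python
import calendar
from datetime import date


def day_level_periods(d):
    """Return (CurPeriod, TotalPeriods) for a date at the day level."""
    cur_period = d.timetuple().tm_yday                     # days so far this year
    total_periods = 366 if calendar.isleap(d.year) else 365
    return cur_period, total_periods


cur, total = day_level_periods(date(2003, 3, 31))
ytd_sales = 100  # made-up year to date value
print(cur, total, round(ytd_sales * (total / cur), 2))
```

At the end of March 2003 this gives 90 of 365 days, so the day level projection comes out slightly higher than the month level one (3 of 12 months), because the first three months contain fewer days than a quarter of the year.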

You can then put the whole lot together and see the results…


WITH

  MEMBER [Measures].[YTD Sales] AS
    AGGREGATE(
      YTD([Date].[Calendar].CurrentMember)
      ,[Measures].[Internet Sales Amount])

  MEMBER [Measures].[CurPeriod] AS
    COUNT(
      YTD([Date].[Calendar].CurrentMember)
      ,INCLUDEEMPTY)

  MEMBER [Measures].[TotalPeriods] AS
    COUNT(
      DESCENDANTS(
        ANCESTOR([Date].[Calendar].CurrentMember
          ,[Date].[Calendar].[Calendar Year])
        ,[Date].[Calendar].CurrentMember.level)
      ,INCLUDEEMPTY)

  MEMBER [Measures].[Full Year Run Rate] AS
    [Measures].[YTD Sales]
    * ([Measures].[TotalPeriods]
       /[Measures].[CurPeriod])

SELECT
{
     [Measures].[Internet Sales Amount]
    ,[Measures].[YTD Sales]
    ,[Measures].[Full Year Run Rate]
    ,[Measures].[CurPeriod]
    ,[Measures].[TotalPeriods]
} ON 0,
{
    DESCENDANTS([Date].[Calendar].[CY 2003])
} ON 1
FROM [Direct Sales]

In my next blog I’ll be doing the same calculation in DAX for use with PowerPivot, stay tuned…

Frog-Blog Out

Excel Cube Pivot drillthrough limited to 1000 rows

When browsing a cube using Excel 2007, you can drillthrough the measures to display up to 1000 rows of the transaction level source data.

I often get asked whether this limit of 1000 rows is configurable – well the good news is yes it is.

There is an option in the actions tab of the BIDS cube designer which allows you to specify the maximum rows, but helpfully this is ignored by Excel. Instead, you have to set it in Excel when you create a pivot.

Just click “Options” on the “PivotTable Tools” ribbon, then in the “Change Data Source” dropdown click on “Connection Properties“. In this screen, just change the “Maximum number of records to retrieve” property.

Excel 2007 Pivot Options

The Frog Blog

I'm Alex Whittles.

I specialise in designing and implementing SQL Server business intelligence solutions, and this is my blog! Just a collection of thoughts, techniques and ramblings on SQL Server, Cubes, Data Warehouses, MDX, DAX and whatever else comes to mind.

Data Platform MVP

Frog Blog Out