

Querying dimension members in MDX

Today we’re back looking at MDX with a quick how-to. How do you extract a list of dimension members using MDX?

This could be used, for example, to populate a parameter list in a report or dashboard user interface.

Let’s look at two scenarios: standard dimension attributes, and then hierarchies. All the queries here use the “Adventure Works” cube from the “Adventure Works DW 2008 R2” database.

Standard dimension attributes

So how do we get a list of dimension members? Well, let’s divide the query up into rows and columns. We want the members listed on the rows.

SELECT
  xxx ON COLUMNS,
  [Geography].[Country].ALLMEMBERS ON ROWS
FROM [Adventure Works]

Why use .ALLMEMBERS instead of .MEMBERS? .ALLMEMBERS will include calculated members, whereas .MEMBERS won’t.
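
You can see the difference by defining a query-scoped calculated member; it will show up with .ALLMEMBERS but not with .MEMBERS. A minimal sketch (the [UK and Ireland] member is made up for illustration):

WITH 
   MEMBER [Geography].[Country].[UK and Ireland] AS 
      AGGREGATE({[Geography].[Country].&[United Kingdom]})
SELECT {} ON COLUMNS,
   [Geography].[Country].ALLMEMBERS ON ROWS -- swap to .MEMBERS and [UK and Ireland] disappears
FROM [Adventure Works]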

So what do we put on the columns? We want the display name for showing to the user, and the unique name so we can refer back to each exact member. We can get these by using properties of the current member, namely MEMBER_CAPTION and UNIQUENAME. Get a full list of dimension member properties here. To make this work, we just define a number of query-scoped calculated members using WITH, and then include them in the columns set…

WITH 
   MEMBER [Measures].[Label] AS [Geography].[Country].CURRENTMEMBER.MEMBER_CAPTION 
   MEMBER [Measures].[UniqueName] AS [Geography].[Country].CURRENTMEMBER.UNIQUENAME 
SELECT {[Measures].[Label], 
        [Measures].[UniqueName]
        } ON COLUMNS , 
      [Geography].[Country].ALLMEMBERS ON ROWS 
FROM [Adventure Works]

If you run this, you get the following output:

This may be what you want, but note that the first row of the result set contains the ‘ALL’ member, which you probably don’t want. You can remove it by altering the ROWS clause. This is simply a case of repeating the attribute name, which references the level rather than the whole hierarchy, e.g. [Geography].[Country].[Country].ALLMEMBERS

WITH 
   MEMBER [Measures].[Label] AS [Geography].[Country].CURRENTMEMBER.MEMBER_CAPTION 
   MEMBER [Measures].[UniqueName] AS [Geography].[Country].CURRENTMEMBER.UNIQUENAME 
SELECT {[Measures].[Label], 
        [Measures].[UniqueName]
        } ON COLUMNS , 
      [Geography].[Country].[Country].ALLMEMBERS ON ROWS 
FROM [Adventure Works]

Hierarchies

With simple dimension attributes, you get a flat list of values. With hierarchies, whether standard user hierarchies, ragged or parent-child, you need to return a multi-level list. To do this we need to know what level each member is at. We can query this using the LEVEL.ORDINAL property. Adding this into the query, and replacing Geography with Employee, we get the following query:

WITH 
   MEMBER [Measures].[Label] AS [Employee].[Employees].CURRENTMEMBER.MEMBER_CAPTION 
   MEMBER [Measures].[UniqueName] AS [Employee].[Employees].CURRENTMEMBER.UNIQUENAME 
   MEMBER [Measures].[Ordinal] AS [Employee].[Employees].CURRENTMEMBER.LEVEL.ORDINAL 
SELECT {[Measures].[Label], 
        [Measures].[UniqueName], 
        [Measures].[Ordinal]
        } ON COLUMNS , 
      [Employee].[Employees].ALLMEMBERS ON ROWS 
FROM [Adventure Works]

With the following results:

You can then do all sorts of funky things using basic MDX navigation functions, e.g. returning the parent of each member by adding CURRENTMEMBER.PARENT.UNIQUENAME as another column.

WITH 
   MEMBER [Measures].[Label] AS [Employee].[Employees].CURRENTMEMBER.MEMBER_CAPTION 
   MEMBER [Measures].[UniqueName] AS [Employee].[Employees].CURRENTMEMBER.UNIQUENAME 
   MEMBER [Measures].[Ordinal] AS [Employee].[Employees].CURRENTMEMBER.LEVEL.ORDINAL 
   MEMBER [Measures].[Parent] AS [Employee].[Employees].CURRENTMEMBER.PARENT.UNIQUENAME
SELECT {[Measures].[Label], 
        [Measures].[UniqueName], 
        [Measures].[Ordinal],
        [Measures].[Parent]
        } ON COLUMNS , 
      [Employee].[Employees].ALLMEMBERS ON ROWS 
FROM [Adventure Works]

Now go forth and query those dimension members…
FrogBlog-Out

Find first order date in MDX

I had an interesting question the other day; “how do I find the date of the first order/activity for a given customer/employee/product etc in MDX”?

Well you’ll be pleased to know that it’s pretty simple. The sample code below will work with the AdventureWorks DW 2008R2 cube.

First of all we need to filter the date dimension to contain only those with orders. Then take the first item of the resulting set, then find its name. Simples.

Filter the date hierarchy to only those dates with a Reseller Order Count:

   FILTER([Date].[Date].[Date].MEMBERS, [Measures].[Reseller Order Count])

Note that you can specify a more advanced filter, such as [Measures].[Reseller Order Count]>10, which would find the first date which had more than 10 orders.
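
For example, the filter step with that more advanced condition would read (the threshold of 10 is arbitrary):

   FILTER([Date].[Date].[Date].MEMBERS, [Measures].[Reseller Order Count] > 10)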

Then find the first item:

   .ITEM(0)

Then find its name:

   .NAME

Put this together in a query and you get:

   WITH MEMBER [Measures].[First Activity] AS
       FILTER([Date].[Date].[Date].MEMBERS
            , [Measures].[Reseller Order Count]).ITEM(0).NAME
   SELECT {[Measures].[Reseller Order Count]
         , [Measures].[First Activity]
       } ON 0,
      [Employee].[Employees].MEMBERS ON 1
   FROM [Adventure Works]

This returns the total reseller order count for each employee, along with the date of their first reseller order. Note that the Employee hierarchy here is a parent-child hierarchy. The calculated member [First Activity] will aggregate and calculate correctly with any dimension or member specified on the 1 (ROWS) axis, be it a parent-child hierarchy, a single attribute or a normal user hierarchy, and will always find the first order date for the current member.
You should get the following results:
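
As a further illustration, the same calculated member can be reused unchanged against a different hierarchy; a quick sketch using the Product Category attribute of the same cube:

   WITH MEMBER [Measures].[First Activity] AS
       FILTER([Date].[Date].[Date].MEMBERS
            , [Measures].[Reseller Order Count]).ITEM(0).NAME
   SELECT {[Measures].[Reseller Order Count]
         , [Measures].[First Activity]
       } ON 0,
      [Product].[Category].[Category].MEMBERS ON 1
   FROM [Adventure Works]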

Frog-Blog Out.

PowerPivot Vs QlikView 101 – SQLBits Video

The video is now available from my PowerPivot and QlikView talk at SQLBits 9 in Liverpool in September 2011. You can download or watch the video here.

In this 1 hour session I create interactive dashboards from scratch in both PowerPivot and QlikView, showing how to set up the data model, overcome common pitfalls and build the dashboards. I create almost identical dashboards in both systems and highlight the pros and cons of each.

You can find the scripts and code to go with this video in this blog post.

SQLBits 10 is coming!!

Today the SQLBits organisers have announced that SQLBits 10 will be held at the Novotel London West in London from 29th to 31st March 2012. It’s going to be even bigger and better, so keep an eye on the website and get your tickets early.

Video: Automating SSAS OLAP Cube documentation

Automating OLAP cube documentation – SQLBits presentation

For anyone that missed my presentation at SQLBits 8 in April, the video is now available here.

In this 1 hour session I present a method of automating the creation of documentation for SSAS OLAP cubes by using DMVs (dynamic management views) and spatial data, querying the metadata of the cube in real time.

The results include the BUS matrix, star schemas, attribute lists, hierarchies etc. and are all presented in SSRS.

The blog posts to go with this are here:

You can view the slide deck here

The Business Intelligence Semantic Model (BISM)

I’m a happy chap. Why? Because I read a blog post yesterday by T.K. Anand (SSAS Principal Group Program Manager) about the vision and roadmap of Analysis Services.

There were slightly concerning questions last November (following the PASS conference) surrounding the future of Analysis Services, or more specifically the UDM, the dimensional model that we all know and love in SSAS 2005 & 2008. The arrival of PowerPivot into the Microsoft BI arsenal has without a doubt moved the goalposts and added significant power, flexibility and usability to the BI stack. My concern, along with others (most notably Chris Webb, who sparked somewhat of a stampede on the issue), was for the future of the UDM and the multitude of existing dimensional systems out in the field. Is the dimensional approach being phased out? Will it be supported in future editions? Will it be enhanced? Will the future BISM support the complexity and power we currently have with the UDM?

There’s no doubt that the overall approach to business intelligence is evolving. And this isn’t just in the cube space, it obviously has a direct effect on all other aspects of the BI strategy; the data warehouse, reporting layer, ETL etc.

From a BI consultant’s point of view, I don’t want to be recommending tools to my clients which have a restricted life span and don’t provide them with a future-proof upgrade path.

From a technology perspective, I’m a hardened supporter of the dimensional model. I recently designed a complex cube system for a banking client which had over 150 dimensions and facts, with thousands of lines of MDX to create a very sophisticated calculation framework for their liquidity modelling and loan profiling. I wouldn’t dream of doing that with a tabular approach like PowerPivot (in its current form).

From a personal point of view, where do I focus my attention in terms of training, research, blogging, user groups, conference sessions etc.?

I should point out that I’m very excited by, and fully committed to the tabular/PowerPivot route (along with VertiPaq, Crescent, DAX, etc.) for systems that it is suited to. In fact I’m using it right now to prototype a global BI solution for a very large client. There are however some solutions that do not fit well with the tabular approach and are best suited to a dimensional approach. I’m in favour of a hybrid framework which allows the right tool to be used for the right system. And it looks like that’s what we’re going to get.

The guys at Microsoft have now evolved and clarified the roadmap, and have confirmed that the BISM (business intelligence semantic model, i.e. The core of Analysis Services in SQL Server Denali) will contain two parallel approaches that can both be used for whichever situations they are best suited to. More importantly, they are both here to stay, will both be developed further, and there will be a cross-availability of functionality and tools between them.

Multidimensional Model

Essentially the same as the existing UDM, the multidimensional data model will support MDX and ROLAP/MOLAP data access. Existing OLAP cubes in SQL 2008 will easily upgrade to this.

Tabular Model

Think of this as hosted PowerPivot: a tabular approach with a column-based data store, DAX as the expression language, and either VertiPaq or DirectQuery for data access.

The two will co-exist side by side within a singular BISM, albeit initially with a degree of separation. In the upcoming CTP2 release (July 2011?) there will not be any cross-availability of functionality, i.e. VertiPaq, Crescent and DAX will not be available to the dimensional model. However TK makes it clear that this is a short-term restriction in the CTP, and that Microsoft are committed to getting this cross-availability in place in the finished product.

If you’re involved in BI in any way, I really do encourage you to go and read TK’s post in detail. The Business Intelligence world is changing. I now have total confidence that it’s for the better.

Frog-blog-out

Debug MDX queries using Drillthrough in SSMS

One of the great features of using Excel to browse an SSAS OLAP cube is the drillthrough ability. If you double click on any cell of an OLAP pivot table, Excel will create a new worksheet containing the top 1000 fact records that went to make up the figure in the selected cell.

N.B. The limit of 1000 rows can be altered, as per one of my previous blog posts here.

This feature is pretty well known, but not many folk realise how easy it is to reproduce this in SQL Server Management Studio (SSMS). All you need to do is prefix your query with DRILLTHROUGH.

i.e. Assuming an MDX query of

SELECT [Measures].[Internet Sales Amount] ON 0
FROM [Adventure Works]
WHERE [Date].[January 1, 2004]

Which returns the following results…

A query of

DRILLTHROUGH
SELECT [Measures].[Internet Sales Amount] ON 0
FROM [Adventure Works]
WHERE [Date].[January 1, 2004]

Returns the records contributing to the total figure. Great for diagnosing problems with an MDX query.

By default, only the first 10,000 rows are returned, but you can override this using MAXROWS

DRILLTHROUGH MAXROWS 500
SELECT [Measures].[Internet Sales Amount] ON 0
FROM [Adventure Works]
WHERE [Date].[January 1, 2004]

The columns that are returned are those defined in the Actions tab of the Cube Designer in BIDS (the Business Intelligence Development Studio).

If no action is defined, then the fact measures will be returned along with the keys that link to each relevant dimension, which tend not to be that helpful.

You can override the returned columns by using the RETURN clause

DRILLTHROUGH
SELECT [Measures].[Internet Sales Amount] ON 0
FROM [Adventure Works]
WHERE [Date].[January 1, 2004]
RETURN [$Internet Sales Order Details].[Internet Sales Order]
      ,[$Sales Territory].[Sales Territory Region]
      ,NAME([$Product].[Product])
      ,KEY([$Product].[Product])
      ,UNIQUENAME([$Product].[Product])
      ,[Internet Sales].[Internet Sales Amount]
      ,[Internet Sales].[Internet Order Quantity]


Note that there are some restrictions on what you can drill through

  • You can’t drill through an expression/calculation, only a raw measure
  • The MDX query needs to return a single cell (otherwise the cube does not know which one to drill through)
  • The data returned will be at the lowest granularity of the cube’s fact table

To explain the last point further, the cube does not return the raw data from the underlying data warehouse, but a summary of the facts grouped by unique combination of the relevant dimensions. i.e. if a warehouse table containing individual sales (by date, product, customer & store) is brought into a cube as a fact table that only has relationships with the date and product dimensions, then the cube drill through will return unique combinations of date and product, summarising sales for each combination. Extra granularity which the warehouse may contain (customer and store) will not be available.

Note that if you specify the RETURN columns, the rows are still returned at the lowest level of the fact table granularity, even if not all of the dimensions are brought out as columns. This may result in returning multiple identical records. Don’t worry, these will be distinct facts, just differentiated by a dimension/attribute that isn’t being returned.

You can find out more on TechNet here

Frog-Blog Out

How to add calculations in SSAS cubes

One of the most useful aspects of a Business Intelligence system is the ability to add calculations to create new measures. This centralises the logic of the calculation into a single place, ensuring consistency and standardisation across the user base.

By way of example, a simple calculation for profit (Income – Expenditure) wouldn’t be provided by the source database and historically would be implemented in each and every report. In a data warehouse and/or cube we can create the calculation in a single place for everyone to use.

This post highlights some of the methods of doing this, each with their respective pros and cons.

Calculated Members in SSAS Cube


SSAS provides a ‘Calculations’ tab in the cube designer which allows you to create new measures using MDX. You can use any combination of existing measures and dimension attributes, along with the plethora of MDX functions available to create highly complex calculations.
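
By way of a sketch, the profit example from earlier could be added on the Calculations tab along these lines (the [Income] and [Expenditure] measure names are made up for illustration):

CREATE MEMBER CURRENTCUBE.[Measures].[Profit]
   AS [Measures].[Income] - [Measures].[Expenditure],  -- calculated at query time, never stored
   FORMAT_STRING = "Currency",
   VISIBLE = 1;
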
Pros:

  • Very complex calculations can be created using all available MDX functions
  • No changes are required to the structure of the data warehouse
  • Changes to the calculation will apply to every record, historic and new
  • The results are not stored in the warehouse or cube, so no extra space is required
  • New calculations can be added without having to deploy or reprocess the cube
  • Calculations can be scoped to any level of aggregation and granularity. Different calculations can even be used for different scopes
  • Calculations can easily combine measures from different measure groups

Cons:

  • The calculation will not make use of SSAS cube aggregations, reducing performance
  • SSAS drill through actions will not work
  • The calculation results are not available in the data warehouse, only the cube

SQL Calculations in the Data Source View


There’s a layer in-between the data warehouse and the cube called the data source view (DSV). This presents the relevant tables in the warehouse to the cube, and can be used to enhance the underlying data with calculations. You can either do this in the DSV layer within the cube project or, my preference, in SQL Server views that encapsulate the logic.
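
For example, a view encapsulating the profit calculation might look like this sketch (the table and column names are made up):

CREATE VIEW dbo.vwFactFinance
AS
SELECT f.DateKey
     , f.Income
     , f.Expenditure
     , f.Income - f.Expenditure AS Profit -- calculated once, consistently, per fact row
FROM dbo.FactFinance f
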
Pros:

  • No changes are required to the table structure of the data warehouse
  • Calculations use SQL not MDX, reducing the complexity
  • Changes to the calculation will apply to every record, historic and new
  • The calculation will make full use of SSAS cube aggregations
  • SSAS drill through actions will work
  • The results are not stored in the warehouse, so the size of the database does not increase

Cons:

  • The cube must be redeployed and reprocessed before the new measure is available
  • The results of the calculation must be valid at the granularity of the fact table
  • The calculation results are not available in the data warehouse, only the cube

Calculate in the ETL process


Whilst bringing in data from the source data systems, it sometimes makes sense to perform calculations on the data at that point, and store the results in the warehouse.
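
A sketch of the same profit calculation performed during the load, again with made-up table names:

INSERT INTO dbo.FactFinance (DateKey, Income, Expenditure, Profit)
SELECT s.DateKey
     , s.Income
     , s.Expenditure
     , s.Income - s.Expenditure -- the result is stored in the warehouse
FROM staging.Finance s
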
Pros:

  • The results of the calculation will be available when querying the warehouse as well as the cube
  • In the ETL pipeline you can import other data sources (using lookups etc.) to utilise other data in the calculation
  • If the calculation uses time based data, or data valid at a specific time (e.g. a share price), then by performing the calculation in the ETL the correct time based data is used, without having to store the full history of the underlying source data
  • The calculation will make full use of SSAS cube aggregations
  • SSAS drill through actions will work

Cons:

  • You have to be able to alter the structure of the data warehouse, which isn’t always an option.
  • The results are stored in the warehouse, increasing the size of the database
  • The results of the calculation must be valid at the granularity of the fact table
  • If the calculation logic changes, all existing records must be updated

In Conclusion

If the calculation is valid for a single record, and it would be of benefit to have access to the results in the warehouse, then perform the calculation in the ETL pipeline and store the results in the warehouse.

If the calculation is valid for a single record, and it would not be of benefit to have the results in the warehouse, then calculate it in the data source view.

If the calculation is too complex for SQL, requiring MDX functions, then create an MDX calculated measure in the cube.

Deploying MDX calculation scripts with XMLA

If you’re a Business Intelligence developer I assume you have BIDS Helper installed. If not then stop reading this post and go and install it. Now. It adds a number of very useful features to the Business Intelligence Development Studio which provide help with many aspects of SSIS, SSRS and SSAS development.

One of my favourite utilities is the Deploy MDX Script function. This takes the calculation script for an SSAS cube (named sets, calculated measures, scope logic, etc.) and deploys it in isolation without having to redeploy and rebuild the entire cube. This is a life saver when trying to write and test complex MDX calculations, and has saved me days if not weeks of waiting around.

The Deploy MDX Script button works perfectly when deploying updated script to the development environment, but what if you want to deploy the same script changes to a testing or live environment? Is there a way of scripting the change without redeploying the entire cube?

Yes there is, by using the following XMLA script. Just change the DatabaseID and CubeID elements of the Object element to point to your Analysis Services database and cube, and paste your MDX calculation script in between the <Text> and </Text> tags. Run the script in SQL Server Management Studio and it should update the cube with the new script.

This script works for SQL Server 2008 and SQL Server 2008 R2.

<Alter 
       AllowCreate="true" 
       ObjectExpansion="ExpandFull" 
       xmlns="http://schemas.microsoft.com/analysisservices/2003/engine" 
       xmlns:as="http://schemas.microsoft.com/analysisservices/2003/engine">
   <Object>
      <DatabaseID>YourDatabaseName</DatabaseID>
      <CubeID>YourCubeName</CubeID>
      <MdxScriptID>MdxScript</MdxScriptID>
   </Object>
   <ObjectDefinition>
      <MdxScript>
         <ID>MdxScript</ID>
         <Name>MdxScript</Name>
         <Commands 
            xmlns:xsd="http://www.w3.org/2001/XMLSchema" 
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
            xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" 
            xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2" 
            xmlns:ddl100_100="http://schemas.microsoft.com/analysisservices/2008/engine/100/100" 
            xmlns:dwd="http://schemas.microsoft.com/DataWarehouse/Designer/1.0">
            <Command>
               <Text>
/*
The CALCULATE command controls the aggregation of leaf cells in the cube.
If the CALCULATE command is deleted or modified, the data within the cube is affected.
You should edit this command only if you manually specify how the cube is aggregated.
*/
CALCULATE;

------------------------------------------------
--Paste your MDX Calculations here
------------------------------------------------

               </Text>
            </Command>
         </Commands>
      </MdxScript>
   </ObjectDefinition>
</Alter>

And there you have it: you can update your MDX calculated members outside of BIDS without doing a full deploy.

Frog-Blog Out

OLAP Cube Documentation in SSRS part 3

This is the 3rd and final post in this series of blog posts, showing how you can use SQL Server Reporting Services (SSRS), DMVs and spatial data to generate real time automated user guide documentation for your Analysis Services (SSAS) OLAP cube.

UPDATE: I presented a 1 hour session at SQLBits 8 covering this work, you can watch the video here.

In this post, I’m going to enhance the measure group report to include a visualisation of the star schema. To do this I’ll be enhancing one of the stored procedures to utilise SQL Server 2008’s new spatial data types combined with SSRS 2008 R2’s new map functionality.

We’ll update the dbo.upCubeDocDimensionsForMeasureGroup stored proc so that it returns a SQL geometry polygon for each row, in the right place around the circumference of the star. There’s a little math in this, but nothing more than a bit of trigonometry.

First the theory. We have an arbitrary number of dimensions that we need to place in a circle around a central point (the measure group). If we have 6 dimensions, then we need to divide the whole circle (360 degrees) by 6 (= 60 degrees each) to get the angle of each dimension around the centre.

Therefore the first dimension needs to be at 60, the second at 120, the third at 180 etc., with the 6th at 360, completing the full circle.
Obviously the angle needs to vary depending on the number of dimensions in the query, so we need to calculate it within the stored proc. To do this I’m using common table expressions (CTEs) to perform further calculations on the basic query.

We wrap the original proc query into a CTE and call it BaseData. We also add an extra field called Seq, which uniquely identifies each row; we’ll use this later to rank the dimensions.

;WITH BaseData AS
(
    SELECT
          mgd.*
        , d.[DESCRIPTION]
        , REPLACE(REPLACE(CAST(mgd.[DIMENSION_UNIQUE_NAME]
              AS VARCHAR(255))
              ,'[',''),']','') AS DimensionCaption
        , REPLACE(REPLACE(CAST(mgd.[MEASUREGROUP_NAME]
              AS VARCHAR(255))
              ,'[',''),']','') AS MeasureGroupCaption
    FROM OPENQUERY(CubeLinkedServer, 'SELECT
                [CATALOG_NAME]
                   +[CUBE_NAME]
                   +[MEASUREGROUP_NAME]
                   +[DIMENSION_UNIQUE_NAME] AS Seq
                , [CATALOG_NAME]
                , [CUBE_NAME]
                , [MEASUREGROUP_NAME]
                , [MEASUREGROUP_CARDINALITY]
                , [DIMENSION_UNIQUE_NAME]
                , [DIMENSION_CARDINALITY]
                , [DIMENSION_IS_VISIBLE]
                , [DIMENSION_IS_FACT_DIMENSION]
                , [DIMENSION_GRANULARITY]
            FROM $SYSTEM.MDSCHEMA_MEASUREGROUP_DIMENSIONS
            WHERE [DIMENSION_IS_VISIBLE]') mgd
        INNER JOIN OPENQUERY(CubeLinkedServer, 'SELECT
                [CATALOG_NAME]
                ,[CUBE_NAME]
                ,[DIMENSION_UNIQUE_NAME]
                ,[DESCRIPTION]
            FROM $SYSTEM.MDSCHEMA_DIMENSIONS
            WHERE [DIMENSION_IS_VISIBLE]') d
                ON  CAST(mgd.[CATALOG_NAME] AS VARCHAR(255))
                  = CAST(d.[CATALOG_NAME] AS VARCHAR(255))
                AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255))
                  = CAST(d.[CUBE_NAME] AS VARCHAR(255))
                AND CAST(mgd.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
                  = CAST(d.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
     WHERE  CAST(mgd.[CATALOG_NAME] AS VARCHAR(255))     = @Catalog
       AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255))         = @Cube
       AND CAST(mgd.[MEASUREGROUP_NAME] AS VARCHAR(255)) = @MeasureGroup
)

We’ll then add a new CTE which calculates the number of records returned by the previous query.

,TotCount AS
(
    SELECT COUNT(*) AS RecCount FROM BaseData
)

Next we cross join TotCount with the base data, so that every row has the extra RecCount field. We then rank each record, providing each with a unique number from 1 to n.

, RecCount AS
(
    SELECT RANK() OVER (ORDER BY CAST(Seq AS VARCHAR(255))) AS RecID
        , RecCount
        , BaseData.*
    FROM
        BaseData CROSS JOIN TotCount
)

Each record now contains its row number, as well as the total number of rows, so it’s easy to calculate its position around the circle (rank/n * 360). Now we have that, calculating the x and y coordinates of each dimension is simply a case of applying Sine and Cosine. Note that the SQL SIN and COS functions expect angles to be provided in radians not degrees, so we have to use the RADIANS function to convert it for us. I’m also multiplying the result by 1000 to scale the numbers up from the -1 to +1 range to -1000 to +1000, which makes our life easier later on.

, Angles AS
(
    SELECT
        *
        , SIN(RADIANS((CAST(RecID AS FLOAT)
            /CAST(RecCount AS FLOAT))
            * 360)) * 1000 AS x
        , COS(RADIANS((CAST(RecID AS FLOAT)
            /CAST(RecCount AS FLOAT))
            * 360)) * 1000 AS y
    FROM RecCount
)

We can now use the x and y coordinates to create a point indicating the position of each dimension, using the code below.

geometry::STGeomFromText('POINT('
   + CAST(y AS VARCHAR(20))
   + ' '
   + CAST(x AS VARCHAR(20))
   + ')',4326) AS Posn

This is a good start, but we want a polygon box, not a single point. We can use a similar geometry function to create a polygon around our point.

geometry::STPolyFromText('POLYGON ((' +
   CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
      + CAST(x+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
   CAST((y*@Stretch)-@BoxSize AS VARCHAR(20)) + ' '
      + CAST(x+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
   CAST((y*@Stretch)-@BoxSize AS VARCHAR(20)) + ' '
      + CAST(x-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
   CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
      + CAST(x-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
   CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
      + CAST(x+(@BoxSize/2) AS VARCHAR(20)) + '
   ))',0) AS Box

You’ll notice that I’m multiplying the y axis by a @Stretch variable. This is to allow us to squash or stretch the resulting star to make it look better in the report. I’m also using a @BoxSize variable which we can use to change the relative size of the boxes. It’s for this reason that I like to work on a -1000 to +1000 scale; it means we can have an integer box size of say 250 instead of a fraction such as 0.25, which I just think is easier to read.
So you’ll now have a stored proc similar to this.

CREATE PROCEDURE [dbo].[upCubeDocDimensionsForMeasureGroup]
    (@Catalog       VARCHAR(255)
    ,@Cube          VARCHAR(255)
    ,@MeasureGroup  VARCHAR(255)
    )
AS

 DECLARE @BoxSize INT
 DECLARE @Stretch FLOAT
 SET @BoxSize = 250
 SET @Stretch = 1.4

;WITH BaseData AS
(
    SELECT
          mgd.*
        , d.[DESCRIPTION]
        , REPLACE(REPLACE(CAST(mgd.[DIMENSION_UNIQUE_NAME]
              AS VARCHAR(255))
              ,'[',''),']','') AS DimensionCaption
        , REPLACE(REPLACE(CAST(mgd.[MEASUREGROUP_NAME]
              AS VARCHAR(255))
              ,'[',''),']','') AS MeasureGroupCaption
    FROM OPENQUERY(CubeLinkedServer, 'SELECT
                [CATALOG_NAME]
                   +[CUBE_NAME]
                   +[MEASUREGROUP_NAME]
                   +[DIMENSION_UNIQUE_NAME] AS Seq
                , [CATALOG_NAME]
                , [CUBE_NAME]
                , [MEASUREGROUP_NAME]
                , [MEASUREGROUP_CARDINALITY]
                , [DIMENSION_UNIQUE_NAME]
                , [DIMENSION_CARDINALITY]
                , [DIMENSION_IS_VISIBLE]
                , [DIMENSION_IS_FACT_DIMENSION]
                , [DIMENSION_GRANULARITY]
            FROM $SYSTEM.MDSCHEMA_MEASUREGROUP_DIMENSIONS
            WHERE [DIMENSION_IS_VISIBLE]') mgd
        INNER JOIN OPENQUERY(CubeLinkedServer, 'SELECT
                [CATALOG_NAME]
                ,[CUBE_NAME]
                ,[DIMENSION_UNIQUE_NAME]
                ,[DESCRIPTION]
            FROM $SYSTEM.MDSCHEMA_DIMENSIONS
            WHERE [DIMENSION_IS_VISIBLE]') d
                ON  CAST(mgd.[CATALOG_NAME] AS VARCHAR(255))
                  = CAST(d.[CATALOG_NAME] AS VARCHAR(255))
                AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255))
                  = CAST(d.[CUBE_NAME] AS VARCHAR(255))
                AND CAST(mgd.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
                  = CAST(d.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
     WHERE  CAST(mgd.[CATALOG_NAME] AS VARCHAR(255))      = @Catalog
        AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255))         = @Cube
        AND CAST(mgd.[MEASUREGROUP_NAME] AS VARCHAR(255)) = @MeasureGroup
)
,TotCount AS
(
    SELECT COUNT(*) AS RecCount FROM BaseData
)
, RecCount AS
(
    SELECT RANK() OVER (ORDER BY CAST(Seq AS VARCHAR(255))) AS RecID
        , RecCount
        , BaseData.*
    FROM
        BaseData CROSS JOIN TotCount
)
, Angles AS
(
    SELECT
        *
        , SIN(RADIANS((CAST(RecID    AS FLOAT)
             /CAST(RecCount AS FLOAT))
             * 360)) * 1000 AS x
        , COS(RADIANS((CAST(RecID AS FLOAT)
             /CAST(RecCount AS FLOAT))
             * 360)) * 1000 AS y
    FROM RecCount
)
,Results AS
(
    SELECT
        *
        , geometry::STGeomFromText('POINT('
            + CAST(y AS VARCHAR(20))
            + ' '
            + CAST(x AS VARCHAR(20))
            + ')',4326) AS Posn
        , geometry::STPolyFromText('POLYGON ((' +
            CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST((y*@Stretch)-@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST((y*@Stretch)-@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x+(@BoxSize/2) AS VARCHAR(20)) + '
            ))',0) AS Box
    FROM Angles
)
SELECT * FROM Results
GO

If you then execute this in Management Studio, you’ll notice an extra tab in the result window called Spatial Results.

EXEC [dbo].[upCubeDocDimensionsForMeasureGroup]
    @Catalog = 'Adventure Works DW 2008R2',
    @Cube = 'Adventure Works',
    @MeasureGroup = 'Financial Reporting'

Click on the Spatial Results tab, then select Box as the spatial column, and you’ll see the boxes that we’ve created in a preview window.

This is now getting somewhere close. But as well as the dimensions, we want to show the measure group in the middle, plus lines linking them together to actually create our star. We can do this by adding a couple more geometry functions to our query. We end up with the end of our proc looking like this.

,Results AS
(
    SELECT
        *
        , geometry::STGeomFromText('POINT('
            + CAST(y AS VARCHAR(20))
            + ' '
            + CAST(x AS VARCHAR(20))
            + ')',4326) AS Posn
        , geometry::STPolyFromText('POLYGON ((' +
            CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST((y*@Stretch)-@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST((y*@Stretch)-@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x+(@BoxSize/2) AS VARCHAR(20)) + '
            ))',0) AS Box
         , geometry::STLineFromText('LINESTRING (0 0, '
              + CAST((y*@Stretch) AS VARCHAR(20))
              + ' ' + CAST(x AS VARCHAR(20))
              + ')', 0) AS Line
         , geometry::STPolyFromText('POLYGON ((' +
            CAST(0+@BoxSize AS VARCHAR(20)) + ' '
              + CAST(0+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST(0-@BoxSize AS VARCHAR(20)) + ' '
              + CAST(0+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST(0-@BoxSize AS VARCHAR(20)) + ' '
              + CAST(0-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST(0+@BoxSize AS VARCHAR(20)) + ' '
              + CAST(0-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST(0+@BoxSize AS VARCHAR(20)) + ' '
              + CAST(0+(@BoxSize/2) AS VARCHAR(20)) + '
            ))',0) AS CenterBox

    FROM Angles
)
SELECT * FROM Results
GO

So, we’ve now got the polygons and lines being generated by the proc; it’s now time to add them into the report and display them to our users.

Firstly open up the CubeDoc_MeasureGroup.rdl report, and go to the properties of the dsDimensions dataset. Click Refresh Fields and the new x, y, Posn, Box, Line and CenterBox fields should now be available.
Then drag a Map from the toolbox onto the report. This will start the map wizard.
Select SQL Server Spatial Query as the source for the data and click Next.
Choose dsDimensions as the dataset to use for the data and click Next.
Choose Box as the spatial field, and Polygon as the layer type. It may well give you an error; just ignore it.
Don’t select the embed map or Bing maps layer.

Click Next, then select a basic map, select the default options for the remaining wizard stages and you’ll end up with a map in your report.

If you preview the report at this stage you won’t see the polygons. This is because the map still thinks it’s a geographical map, and it is trying to draw our boxes as latitudes and longitudes. We don’t want this; we want it to show them on our own scale. To fix this, just change the CoordinateSystem property of the map from Geographic to Planar.

You can then preview the report, which should show you something like this

It still doesn’t look like a star, but we’ve only got a few more changes to make. We need to add a couple more layers to the map: the center box for the measure group, and the lines to link them all together.
Add a new polygon layer to the map and then set the layer data to use the CenterBox field from the dsDimensions dataset.

Repeat the above, but with a new line layer instead of a polygon layer. Set the layer data to use the Line field of dsDimensions.

To move the lines behind the boxes, just rearrange the order of the layers by using the blue arrows in the map layers window. We want the Lines layer to be at the bottom.

Set the PolygonTemplate.BackgroundColor property of the CenterBox layer to Khaki, and set the same property of the dimension box layer to LightBlue.

Then set the PolygonTemplate.Label property of the CenterBox layer to the MeasureGroupCaption field, and set the ShowLabel property to True. If you don’t, SSRS will decide whether or not to show the label; we want it to always show.
Set the PolygonTemplate.Label property of the Dimension layer to the DimensionCaption field, and set the ShowLabel property to True.
You can then play around with font sizes, zoom, background colours, line widths etc. to get the effects that you want, and you’ll end up with a star schema visualisation similar to this.

You can also configure actions for each layer. Using this you can hyperlink each dimension box to the CubeDoc_Dimension report for the selected dimension, making the star schema interactive.

This has been quite a fun blog post to investigate; I hope you can take something useful from it and have as much fun with it as I have. Every demo that I’ve seen using spatial data has been based on maps; hopefully this shows an alternative use beyond geographical mapping.

OLAP Cube Documentation in SSRS part 2

In my previous post I described how to create a number of stored procedures that use Dynamic Management Views (DMVs) to return the metadata structure of an SSAS 2008 OLAP cube, including dimensions, attributes, measure groups, BUS matrix etc.

In this post I’m going to use those procs to create a set of SSRS 2008 reports that will serve as the automated documentation of your cube. I’m going to make the following assumptions:

  • You’ve read the part 1 post, and already have the stored procs in a database.
  • You know the basics of SSRS 2008

If you haven’t read part 1, you can jump to it here.

UPDATE: I presented a 1 hour session at SQLBits 8 covering this work, you can watch the video here.

Firstly I create a basic template report which has an appropriate header, with date/time stamp, server name and logos etc. This means that all of the reports will have a common look and feel. I could of course make use of the new report parts in SSRS 2008 R2 for this, but to maintain compatibility with pre R2 I’ll keep it simple.

The expression in the box on the top right is just to display today’s date in UK format dd/mm/yyyy.

  =FORMAT(Today(),"dd/MM/yyyy")

The reports that we’ll build will include the following:

  • CubeDoc_Cubes.rdl – Entry page, list of all cubes in the database
  • CubeDoc_Cube.rdl – showing all measure groups and dimensions within the selected cube
  • CubeDoc_Dimension.rdl – showing all hierarchies, attributes and related measure groups
  • CubeDoc_MeasureGroup.rdl – showing all measures and related dimensions
  • CubeDoc_Search.rdl – search names and descriptions of cubes, dimensions, measure groups, etc

Create the first report (CubeDoc_Cubes.rdl) which will act as the entry screen and menu of cubes.
Add a dataset dsCubes, and point it at the stored proc dbo.upCubeDocCubes

The proc has a @Catalog parameter, which filters the result set to a specific SSAS database (catalog). We want to return all cubes from all catalogs, so set the parameter value to =Nothing

All we have to do now is add a table and pull in the dataset fields that we want to see.

We can then preview the report to test that it returns the right list of cubes. You should see something like this.

Note that the AdventureWorks database doesn’t contain any descriptions, so you won’t see any in the report but they will be there when you add descriptions to your own cubes.

The next report we’re going to write is the CubeDoc_Cube report, which will list the measure groups, dimensions and BUS matrix of a single cube. We’ll link the two reports together later on.

Create a new report, using the template report you created earlier (select the template report in the solution explorer window, then CTRL+C then CTRL+V) and rename the new file as CubeDoc_Cube.rdl.

Add a report parameter called @Catalog which should be Text. I’ve set mine to default to “Adventure Works DW 2008R2” to make testing easier.

Add a dataset called dsCubes, and point it at the dbo.upCubeDocCubes proc, and link the @Catalog dataset parameter to the @Catalog report parameter.

This dataset will query the available cubes for the selected catalog, and populate a new parameter which we’ll now create, called @Cube. This should also be a text parameter, but this time we’ll set the available values to those returned by the dsCubes dataset.

If you want you can also set the default value of the parameter to the CUBE_NAME field of dsCubes. This parameter is not a multi value parameter, so by defaulting it to the dataset it will just default to the first record.

We can now use @Catalog and @Cube parameters to query the available measure groups and dimensions.
So, create three new datasets:

  • dsDimensions – pointing to dbo.upCubeDocDimensionsInCube
  • dsMeasureGroups – pointing to dbo.upCubeDocMeasureGroupsInCube
  • dsBusMatrix – pointing to dbo.upCubeDocBUSMatrix

Set each of their @Catalog parameters to the report’s @Catalog parameter, and their @Cube parameters to the report’s @Cube parameter.

Create two tables in the report, one for measure groups and one for dimensions. Drag in the fields that you want to see, and preview the report. You should see something like this.

I’ve added a couple of textbox titles for good measure.

The third dataset, dsBusMatrix, requires something a little more interesting. For those that aren’t familiar with Kimball’s BUS Matrix concept, it’s a grid that shows the relationship and connectivity between facts (measure groups) and their dimensions. As the name suggests, we’ll use SSRS’s Matrix control for this. Once you’ve added a matrix control onto the report, follow these steps (using the dsBusMatrix dataset):

  • Drag DIMENSION_UNIQUE_NAME onto the Rows of the matrix
  • Drag MEASUREGROUP_NAME onto the columns of the matrix
  • Right click on the Data box of the matrix, and type in the expression below. This checks the cardinality of the dimensions/measures to determine whether it is a regular relationship, a many-to-many or a degenerate fact
=SWITCH(SUM(Fields!Relationship.Value)=0,"",
   Fields!DIMENSION_IS_FACT_DIMENSION.Value=1,"F",
   Fields!MEASUREGROUP_CARDINALITY.Value="MANY","M",
   True,"X")

This will either show an X if there is a regular relationship, or an M or F if there’s a many-to-many or degenerate relationship respectively.
To make it easier to read, I also like to set the background colour of the Data textbox to highlight the type of relationship further.

=SWITCH(SUM(Fields!Relationship.Value)=0,"Transparent",
   Fields!DIMENSION_IS_FACT_DIMENSION.Value=1,"Yellow",
   Fields!MEASUREGROUP_CARDINALITY.Value="MANY","Red",
   True,"CornflowerBlue")

If you preview the report you should see the following

It shows the concept, but it needs a little tidying up. Centering the text in the data textbox helps, but we can also use a fantastic new feature in SSRS 2008 R2 to rotate the column titles. Simply set the WritingMode property in the Localization group to Rotate270, and then shrink the width of the column.

I’ve also added a title, with a key, and a level of row grouping using the DIMENSION_MASTER_NAME field, which groups role-playing dimensions by their master dimension. It should now look something like this.

That’s it for this report, so save it, then go back to the first report (CubeDoc_Cubes.rdl), right click on the [CUBE_NAME] textbox and select Properties. Go to the Action tab, and set the action to navigate to the CubeDoc_Cube report, passing through the CATALOG_NAME and CUBE_NAME fields from the dataset as the parameter values. This sets up a hyperlink from one report to the other, allowing users to navigate around the cube doc reports by clicking on what they want to know about.

We then need to do the same for 3 other reports:
CubeDoc_Dimension

  • dbo.upCubeDocAttributesInDimension results in a table
  • dbo.upCubeDocMeasureGroupsForDimension results in a table
  • Add an extra @Dimension parameter, populated from dbo.upCubeDocDimensionsInCube

CubeDoc_MeasureGroup

  • dbo.upCubeDocMeasuresInMeasureGroup results in a table
  • dbo.upCubeDocDimensionsForMeasureGroup results in a table
  • Add an extra @MeasureGroup parameter, populated from dbo.upCubeDocMeasureGroupsInCube

CubeDoc_Search

  • Add an extra @Search parameter, text, with no default
  • Table containing results from dbo.upCubeDocSearch, using the @Search parameter

Link all appropriate textboxes (measure groups, dimensions, search etc.) to their relevant report using the report action, and hey presto – a fully automated, real time, self-documenting cube report.

In the next and final installment of this series of blog posts, we’ll explore SQL 2008’s spatial data to generate an automated star schema visualisation to add that little something extra to the reports.
