SQL Server User Group – Birmingham

It’s only a week to go until the first Midlands SQL Server User Group, being held on March 10th 2011 at the Old Joint Stock pub in Birmingham.

We’ve got two of the best speakers in the UK lined up, Allan Mitchell and Neil Hambly, we’re putting food on for you (pork pies and chip butties!) and it’s being held in a pub so beer will also be involved.

SQL Server, pork pies and beer – all in one place?! Go on, how can you resist?!

Register for FREE here

Speed up SSIS by using a slower query

This isn’t a technical blog post of my own, but a shout out to Rob Farley and an excellent blog post explaining how to use SQL’s OPTION (FAST x) hint. He explains how you can speed up an SSIS data flow by slowing down the source query. It may seem illogical at first, but you’ll understand after you go and read Rob’s post!
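
As a rough illustration (my own sketch, not Rob’s code), the hint is simply appended to the source query; the table and column names here are hypothetical:

SELECT SalesOrderID, OrderQty, UnitPrice
FROM dbo.FactSales
OPTION (FAST 10000)

This asks the optimiser to favour a plan that returns the first 10,000 rows as quickly as possible, which suits the way SSIS consumes rows in a streaming data flow.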

Read Rob’s post here: Speeding up SSIS using OPTION (FAST)

How to add calculations in SSAS cubes

One of the most useful aspects of a Business Intelligence system is the ability to add calculations to create new measures. This centralises the logic of the calculation into a single place, ensuring consistency and standardisation across the user base.

By way of example, a simple calculation for profit (Income – Expenditure) wouldn’t be provided by the source database and historically would be implemented in each and every report. In a data warehouse and/or cube we can create the calculation in a single place for everyone to use.

This post highlights some of the methods of doing this, each with their respective pros and cons.

Calculated Members in SSAS Cube


SSAS provides a ‘Calculations’ tab in the cube designer which allows you to create new measures using MDX. You can use any combination of existing measures and dimension attributes, along with the plethora of MDX functions available to create highly complex calculations.
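
By way of illustration, a simple profit calculation entered in the script view of the Calculations tab might look like this (a minimal sketch, assuming measures named Income and Expenditure):

CREATE MEMBER CURRENTCUBE.[Measures].[Profit]
 AS [Measures].[Income] - [Measures].[Expenditure],
 FORMAT_STRING = 'Currency',
 VISIBLE = 1;
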
Pros:

  • Very complex calculations can be created using all available MDX functions
  • No changes are required to the structure of the data warehouse
  • Changes to the calculation will apply to every record, historic and new
  • The results are not stored in the warehouse or cube, so no extra space is required
  • New calculations can be added without having to deploy or reprocess the cube
  • Calculations can be scoped to any level of aggregation and granularity. Different calculations can even be used for different scopes
  • Calculations can easily combine measures from different measure groups

Cons:

  • The calculation will not make use of SSAS cube aggregations, reducing performance
  • SSAS drill through actions will not work
  • The calculation results are not available in the data warehouse, only the cube

SQL Calculations in the Data Source View


There’s a layer in-between the data warehouse and the cube called the data source view (DSV). This presents the relevant tables in the warehouse to the cube, and can be used to enhance the underlying data with calculations. This can either be done in the DSV layer within the cube project, or, as I prefer, in SQL Server views that encapsulate the logic.
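
As a minimal sketch, assuming a hypothetical FactFinance table with Income and Expenditure columns, the view approach might look like this; the DSV then references the view in place of the underlying table:

CREATE VIEW dbo.vwFactFinance
AS
SELECT
     f.DateKey
    ,f.OrganisationKey
    ,f.Income
    ,f.Expenditure
    ,f.Income - f.Expenditure AS Profit  --Calculated at query time, not stored
FROM dbo.FactFinance f
GO
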
Pros:

  • No changes are required to the table structure of the data warehouse
  • Calculations use SQL not MDX, reducing the complexity
  • Changes to the calculation will apply to every record, historic and new
  • The calculation will make full use of SSAS cube aggregations
  • SSAS drill through actions will work
  • The results are not stored in the warehouse, so the size of the database does not increase

Cons:

  • The cube must be redeployed and reprocessed before the new measure is available
  • The results of the calculation must be valid at the granularity of the fact table
  • The calculation results are not available in the data warehouse, only the cube

Calculate in the ETL process


Whilst bringing in data from the source data systems, it sometimes makes sense to perform calculations on the data at that point, and store the results in the warehouse.
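
A minimal sketch of this approach, assuming hypothetical staging and warehouse tables; the same calculation could equally be performed in a Derived Column transformation within the SSIS data flow:

INSERT INTO dbo.FactFinance
    (DateKey, OrganisationKey, Income, Expenditure, Profit)
SELECT
     DateKey
    ,OrganisationKey
    ,Income
    ,Expenditure
    ,Income - Expenditure AS Profit  --Calculated once, stored in the warehouse
FROM staging.FinanceTransactions
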
Pros:

  • The results of the calculation will be available when querying the warehouse as well as the cube
  • In the ETL pipeline you can import other data sources (using lookups etc.) to utilise other data in the calculation
  • If the calculation uses time-based data, or data valid at a specific time (e.g. a share price), then by performing the calculation in the ETL the correct time-based data is used, without having to store the full history of the underlying source data
  • The calculation will make full use of SSAS cube aggregations
  • SSAS drill through actions will work

Cons:

  • You have to be able to alter the structure of the data warehouse, which isn’t always an option.
  • The results are stored in the warehouse, increasing the size of the database
  • The results of the calculation must be valid at the granularity of the fact table
  • If the calculation logic changes, all existing records must be updated

In Conclusion

If the calculation is valid for a single record, and it would be of benefit to have access to the results in the warehouse, then perform the calculation in the ETL pipeline and store the results in the warehouse.

If the calculation is valid for a single record, and it would not be of benefit to have the results in the warehouse, then calculate it in the data source view.

If the calculation is too complex for SQL, requiring MDX functions, then create an MDX calculated measure in the cube.

OLAP Cube Documentation in SSRS part 3

This is the 3rd and final post in this series of blog posts, showing how you can use SQL Server Reporting Services (SSRS), DMVs and spatial data to generate real time automated user guide documentation for your Analysis Services (SSAS) OLAP cube.

UPDATE: I presented a 1 hour session at SQLBits 8 covering this work; you can watch the video here.

In this post, I’m going to enhance the measure group report to include a visualisation of the star schema. To do this I’ll be enhancing one of the stored procedures to utilise SQL Server 2008’s new spatial data types combined with SSRS 2008 R2’s new map functionality.

To do this we’ll update the dbo.upCubeDocDimensionsForMeasureGroup stored proc so that it returns a SQL geometry polygon for each row, in the right place around the circumference of the star. There’s a little math in this, but nothing more than a bit of trigonometry.

First the theory. We have an arbitrary number of dimensions that we need to place in a circle around a central point (the measure group). If we have 6 dimensions, then we need to divide the whole circle (360 degrees) by 6 (= 60 degrees each) to get the angle of each around the central point.

Therefore the first dimension needs to be at 60, the second at 120, the third at 180 etc., with the 6th at 360, completing the full circle.
Obviously the angle needs to vary depending on the number of dimensions in the query, so we need to calculate it within the stored proc. To do this I’m using common table expressions (CTEs) to perform further calculations on the basic query.

We wrap the original proc query into a CTE and call it BaseData. We also add an extra field called Seq, which uniquely identifies each row; we’ll use this later to rank the dimensions.

;WITH BaseData AS
(
    SELECT
          mgd.*
        , d.[DESCRIPTION]
        , REPLACE(REPLACE(CAST(mgd.[DIMENSION_UNIQUE_NAME]
              AS VARCHAR(255))
              ,'[',''),']','') AS DimensionCaption
        , REPLACE(REPLACE(CAST(mgd.[MEASUREGROUP_NAME]
              AS VARCHAR(255))
              ,'[',''),']','') AS MeasureGroupCaption
    FROM OPENQUERY(CubeLinkedServer, 'SELECT
                [CATALOG_NAME]
                   +[CUBE_NAME]
                   +[MEASUREGROUP_NAME]
                   +[DIMENSION_UNIQUE_NAME] AS Seq
                , [CATALOG_NAME]
                , [CUBE_NAME]
                , [MEASUREGROUP_NAME]
                , [MEASUREGROUP_CARDINALITY]
                , [DIMENSION_UNIQUE_NAME]
                , [DIMENSION_CARDINALITY]
                , [DIMENSION_IS_VISIBLE]
                , [DIMENSION_IS_FACT_DIMENSION]
                , [DIMENSION_GRANULARITY]
            FROM $SYSTEM.MDSCHEMA_MEASUREGROUP_DIMENSIONS
            WHERE [DIMENSION_IS_VISIBLE]') mgd
        INNER JOIN OPENQUERY(CubeLinkedServer, 'SELECT
                [CATALOG_NAME]
                ,[CUBE_NAME]
                ,[DIMENSION_UNIQUE_NAME]
                ,[DESCRIPTION]
            FROM $SYSTEM.MDSCHEMA_DIMENSIONS
            WHERE [DIMENSION_IS_VISIBLE]') d
                ON  CAST(mgd.[CATALOG_NAME] AS VARCHAR(255))
                  = CAST(d.[CATALOG_NAME] AS VARCHAR(255))
                AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255))
                  = CAST(d.[CUBE_NAME] AS VARCHAR(255))
                AND CAST(mgd.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
                  = CAST(d.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
     WHERE  CAST(mgd.[CATALOG_NAME] AS VARCHAR(255))     = @Catalog
       AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255))         = @Cube
       AND CAST(mgd.[MEASUREGROUP_NAME] AS VARCHAR(255)) = @MeasureGroup
)

We’ll then add a new CTE which calculates the number of records returned by the previous query.

,TotCount AS
(
    SELECT COUNT(*) AS RecCount FROM BaseData
)

Next we cross join TotCount with the base data, so that every row has the extra RecCount field. We then rank each record, providing each with a unique number from 1 to n.

, RecCount AS
(
    SELECT RANK() OVER (ORDER BY CAST(Seq AS VARCHAR(255))) AS RecID
        , RecCount
        , BaseData.*
    FROM
        BaseData CROSS JOIN TotCount
)

Each record now contains its row number, as well as the total number of rows, so it’s easy to calculate its position around the circle (rank/n * 360). Now we have that, calculating the x and y coordinates of each dimension is simply a case of applying sine and cosine. Note that the SQL SIN and COS functions expect angles to be provided in radians, not degrees, so we have to use the RADIANS function to convert for us. I’m also multiplying the result by 1000 to scale the numbers up from the -1 to +1 range to -1000 to +1000, which makes our life easier later on.

, Angles AS
(
    SELECT
        *
        , SIN(RADIANS((CAST(RecID AS FLOAT)
            /CAST(RecCount AS FLOAT))
            * 360)) * 1000 AS x
        , COS(RADIANS((CAST(RecID AS FLOAT)
            /CAST(RecCount AS FLOAT))
            * 360)) * 1000 AS y
    FROM RecCount
)

We can now use the x and y coordinates to create a point indicating the position of each dimension, using the code below.

geometry::STGeomFromText('POINT('
   + CAST(y AS VARCHAR(20))
   + ' '
   + CAST(x AS VARCHAR(20))
   + ')',4326) AS Posn

This is a good start, but we want a polygon box, not a single point. We can use a similar geometry function to create a polygon around our point.

geometry::STPolyFromText('POLYGON ((' +
   CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
      + CAST(x+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
   CAST((y*@Stretch)-@BoxSize AS VARCHAR(20)) + ' '
      + CAST(x+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
   CAST((y*@Stretch)-@BoxSize AS VARCHAR(20)) + ' '
      + CAST(x-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
   CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
      + CAST(x-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
   CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
      + CAST(x+(@BoxSize/2) AS VARCHAR(20)) + '
   ))',0) AS Box

You’ll notice that I’m multiplying the y axis by a @Stretch variable. This allows us to squash or stretch the resulting star to make it look better in the report. I’m also using a @BoxSize variable which we can use to change the relative size of the boxes. This is why I like to work on a -1000 to +1000 scale; it means we can have an integer box size of, say, 250 instead of a fraction such as 0.25, which I find easier to read.
So you’ll now have a stored proc similar to this.

CREATE PROCEDURE [dbo].[upCubeDocDimensionsForMeasureGroup]
    (@Catalog       VARCHAR(255)
    ,@Cube          VARCHAR(255)
    ,@MeasureGroup  VARCHAR(255)
    )
AS

 DECLARE @BoxSize INT
 DECLARE @Stretch FLOAT
 SET @BoxSize = 250
 SET @Stretch = 1.4

;WITH BaseData AS
(
    SELECT
          mgd.*
        , d.[DESCRIPTION]
        , REPLACE(REPLACE(CAST(mgd.[DIMENSION_UNIQUE_NAME]
              AS VARCHAR(255))
              ,'[',''),']','') AS DimensionCaption
        , REPLACE(REPLACE(CAST(mgd.[MEASUREGROUP_NAME]
              AS VARCHAR(255))
              ,'[',''),']','') AS MeasureGroupCaption
    FROM OPENQUERY(CubeLinkedServer, 'SELECT
                [CATALOG_NAME]
                   +[CUBE_NAME]
                   +[MEASUREGROUP_NAME]
                   +[DIMENSION_UNIQUE_NAME] AS Seq
                , [CATALOG_NAME]
                , [CUBE_NAME]
                , [MEASUREGROUP_NAME]
                , [MEASUREGROUP_CARDINALITY]
                , [DIMENSION_UNIQUE_NAME]
                , [DIMENSION_CARDINALITY]
                , [DIMENSION_IS_VISIBLE]
                , [DIMENSION_IS_FACT_DIMENSION]
                , [DIMENSION_GRANULARITY]
            FROM $SYSTEM.MDSCHEMA_MEASUREGROUP_DIMENSIONS
            WHERE [DIMENSION_IS_VISIBLE]') mgd
        INNER JOIN OPENQUERY(CubeLinkedServer, 'SELECT
                [CATALOG_NAME]
                ,[CUBE_NAME]
                ,[DIMENSION_UNIQUE_NAME]
                ,[DESCRIPTION]
            FROM $SYSTEM.MDSCHEMA_DIMENSIONS
            WHERE [DIMENSION_IS_VISIBLE]') d
                ON  CAST(mgd.[CATALOG_NAME] AS VARCHAR(255))
                  = CAST(d.[CATALOG_NAME] AS VARCHAR(255))
                AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255))
                  = CAST(d.[CUBE_NAME] AS VARCHAR(255))
                AND CAST(mgd.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
                  = CAST(d.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
     WHERE  CAST(mgd.[CATALOG_NAME] AS VARCHAR(255))      = @Catalog
        AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255))         = @Cube
        AND CAST(mgd.[MEASUREGROUP_NAME] AS VARCHAR(255)) = @MeasureGroup
)
,TotCount AS
(
    SELECT COUNT(*) AS RecCount FROM BaseData
)
, RecCount AS
(
    SELECT RANK() OVER (ORDER BY CAST(Seq AS VARCHAR(255))) AS RecID
        , RecCount
        , BaseData.*
    FROM
        BaseData CROSS JOIN TotCount
)
, Angles AS
(
    SELECT
        *
        , SIN(RADIANS((CAST(RecID    AS FLOAT)
             /CAST(RecCount AS FLOAT))
             * 360)) * 1000 AS x
        , COS(RADIANS((CAST(RecID AS FLOAT)
             /CAST(RecCount AS FLOAT))
             * 360)) * 1000 AS y
    FROM RecCount
)
,Results AS
(
    SELECT
        *
        , geometry::STGeomFromText('POINT('
            + CAST(y AS VARCHAR(20))
            + ' '
            + CAST(x AS VARCHAR(20))
            + ')',4326) AS Posn
        , geometry::STPolyFromText('POLYGON ((' +
            CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST((y*@Stretch)-@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST((y*@Stretch)-@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x+(@BoxSize/2) AS VARCHAR(20)) + '
            ))',0) AS Box
    FROM Angles
)
SELECT * FROM Results
GO

If you then execute this in Management Studio, you’ll notice an extra tab in the result window called Spatial Results.

EXEC [dbo].[upCubeDocDimensionsForMeasureGroup]
    @Catalog = 'Adventure Works DW 2008R2',
    @Cube = 'Adventure Works',
    @MeasureGroup = 'Financial Reporting'

Click on the Spatial Results tab, then select Box as the spatial column, and you’ll see the boxes that we’ve created in a preview window.

This is now getting somewhere close. But as well as the dimensions, we also want to show the measure group in the middle, along with lines linking them together to actually create our star. We can do this by adding a couple more geometry functions to our query, so that the end of our proc looks like this.

,Results AS
(
    SELECT
        *
        , geometry::STGeomFromText('POINT('
            + CAST(y AS VARCHAR(20))
            + ' '
            + CAST(x AS VARCHAR(20))
            + ')',4326) AS Posn
        , geometry::STPolyFromText('POLYGON ((' +
            CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST((y*@Stretch)-@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST((y*@Stretch)-@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST((y*@Stretch)+@BoxSize AS VARCHAR(20)) + ' '
              + CAST(x+(@BoxSize/2) AS VARCHAR(20)) + '
            ))',0) AS Box
         , geometry::STLineFromText('LINESTRING (0 0, '
              + CAST((y*@Stretch) AS VARCHAR(20))
              + ' ' + CAST(x AS VARCHAR(20))
              + ')', 0) AS Line
         , geometry::STPolyFromText('POLYGON ((' +
            CAST(0+@BoxSize AS VARCHAR(20)) + ' '
              + CAST(0+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST(0-@BoxSize AS VARCHAR(20)) + ' '
              + CAST(0+(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST(0-@BoxSize AS VARCHAR(20)) + ' '
              + CAST(0-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST(0+@BoxSize AS VARCHAR(20)) + ' '
              + CAST(0-(@BoxSize/2) AS VARCHAR(20)) + ', ' +
            CAST(0+@BoxSize AS VARCHAR(20)) + ' '
              + CAST(0+(@BoxSize/2) AS VARCHAR(20)) + '
            ))',0) AS CenterBox

    FROM Angles
)
SELECT * FROM Results
GO

So, we’ve now got the polygons and lines being generated by the proc; it’s now time to add them to the report and display them to our users.

Firstly open up the CubeDoc_MeasureGroup.rdl report, and go to the properties of the dsDimensions dataset. Click Refresh Fields and the new x, y, Posn, Box, Line and CenterBox fields should now be available.
Then drag a Map from the toolbox onto the report. This will start the map wizard.
Select SQL Server Spatial Query as the source for the data and click Next.
Choose dsDimensions as the dataset to use for the data and click Next.
Choose Box as the spatial field, and Polygon as the layer type. It may well give you an error; just ignore it.
Don’t select the embed map or Bing maps layer.

Click Next, then select a basic map, select the default options for the remaining wizard stages and you’ll end up with a map in your report.

If you preview the report at this stage you won’t see the polygons. This is because the map still thinks it’s a geographical map, and is trying to draw our boxes as latitudes and longitudes. We don’t want this; we want it to show them on our own scale. To fix this, just change the CoordinateSystem property of the map from Geographic to Planar.

You can then preview the report, which should show you something like this.

It still doesn’t look like a star, but we’ve got a few more changes to make. We need to add a couple more layers to the map: the center box for the measure group, and then the lines to link them all together.
Add a new polygon layer to the map and then set the layer data to use the CenterBox field from the dsDimensions dataset.

Repeat the above, but with a new line layer instead of a polygon layer. Set the layer data to use the Line field of dsDimensions.

To move the lines behind the boxes, just rearrange the order of the layers by using the blue arrows in the map layers window. We want the Lines layer to be at the bottom.

Set the PolygonTemplate.BackgroundColor property of the CenterBox layer to Khaki, and set the same property of the dimension box layer to LightBlue.

Then set the PolygonTemplate.Label property of the CenterBox layer to the MeasureGroupCaption field, and set the ShowLabel property to True. If you don’t, SSRS will decide whether or not to show the label; we want it to always show.
Set the PolygonTemplate.Label property of the Dimension layer to the DimensionCaption field, and set the ShowLabel property to True.
You can then play around with font sizes, zoom, background colours, line widths etc. to get the effects that you want, but you’ll end up with a star schema visualisation similar to this.

You can also configure actions for each layer. Using this you can hyperlink each dimension box to show the CubeDoc_Dimension report for the selected dimension etc, making the star schema interactive.

This has been quite a fun blog post to investigate; I hope you can take something useful from it and have as much fun with it as I’ve had. Every demo that I’ve seen using spatial data has been based on maps; hopefully this shows an alternative use beyond geographical mapping.

OLAP Cube Documentation in SSRS part 2

In my previous post I described how to create a number of stored procedures that use Dynamic Management Views (DMVs) to return the metadata structure of an SSAS 2008 OLAP cube, including dimensions, attributes, measure groups, BUS matrix etc.

In this post I’m going to use those procs to create a set of SSRS 2008 reports that will serve as the automated documentation of your cube. I’m going to make the following assumptions:

  • You’ve read the part 1 post, and already have the stored procs in a database.
  • You know the basics of SSRS 2008

If you haven’t read part 1, you can jump to it here.

UPDATE: I presented a 1 hour session at SQLBits 8 covering this work; you can watch the video here.

Firstly I create a basic template report which has an appropriate header, with date/time stamp, server name and logos etc. This means that all of the reports will have a common look and feel. I could of course make use of the new report parts in SSRS 2008 R2 for this, but to maintain compatibility with pre R2 I’ll keep it simple.

The expression in the box on the top right is just to display today’s date in UK format dd/mm/yyyy.

  =FORMAT(Today(),"dd/MM/yyyy")

The reports that we’ll build will include the following:

  • CubeDoc_Cubes.rdl – Entry page, list of all cubes in the database
  • CubeDoc_Cube.rdl – showing all measure groups and dimensions within the selected cube
  • CubeDoc_Dimension.rdl – showing all hierarchies, attributes and related measure groups
  • CubeDoc_MeasureGroup.rdl – showing all measures and related dimensions
  • CubeDoc_Search.rdl – search names and descriptions of cubes, dimensions, measure groups, etc

Create the first report (CubeDoc_Cubes.rdl) which will act as the entry screen and menu of cubes.
Add a dataset dsCubes, and point it at the stored proc dbo.upCubeDocCubes

The proc has a @Catalog parameter, which filters the result set to a specific SSAS database (catalog). We want to return all cubes from all catalogs, so set the parameter value to =Nothing

All we have to do now is add a table and pull in the dataset fields that we want to see.

We can then preview the report to test that it returns the right list of cubes. You should see something like this.

Note that the AdventureWorks database doesn’t contain any descriptions, so you won’t see any in the report but they will be there when you add descriptions to your own cubes.

The next report we’re going to write is the CubeDoc_Cube report, which will list the measure groups, dimensions and BUS matrix of a single cube. We’ll link the two reports together later on.

Create a new report, using the template report you created earlier (select the template report in the solution explorer window, then CTRL+C then CTRL+V) and rename the new file as CubeDoc_Cube.rdl.

Add a report parameter called @Catalog which should be Text. I’ve set mine to default to “Adventure Works DW 2008R2” to make testing easier.

Add a dataset called dsCubes, and point it at the dbo.upCubeDocCubes proc, and link the @Catalog dataset parameter to the @Catalog report parameter.

This dataset will query the available cubes for the selected catalog, and populate a new parameter which we’ll now create, called @Cube. This should also be a text parameter, but this time we’ll set the available values to those returned by the dsCubes dataset.

If you want you can also set the default value of the parameter to the CUBE_NAME field of dsCubes. This parameter is not a multi value parameter, so by defaulting it to the dataset it will just default to the first record.

We can now use @Catalog and @Cube parameters to query the available measure groups and dimensions.
So, create three new datasets:

  • dsDimensions – pointing to dbo.upCubeDocDimensionsInCube
  • dsMeasureGroups – pointing to dbo.upCubeDocMeasureGroupsInCube
  • dsBusMatrix – pointing to dbo.upCubeDocBUSMatrix

Set each of their @Catalog parameters to the report’s @Catalog parameter, and their @Cube parameters to the report’s @Cube parameter.

Create two tables in the report, one for measure groups and one for dimensions. Drag in the fields that you want to see, and preview the report. You should see something like this.

I’ve added a couple of textbox titles for good measure.

The third dataset, dsBusMatrix, requires something a little more interesting. For those who aren’t familiar with Kimball’s BUS matrix concept, it’s a grid that shows the relationship and connectivity between facts (measure groups) and their dimensions. As the name suggests, we’ll use SSRS’s Matrix control for this. Once you’ve added a matrix control onto the report, follow these steps (using the dsBusMatrix dataset):

  • Drag DIMENSION_UNIQUE_NAME onto the Rows of the matrix
  • Drag MEASUREGROUP_NAME onto the columns of the matrix
  • Right click on the Data box of the matrix, and type in the expression below. This checks the cardinality of the dimensions/measures to determine whether it is a regular relationship, a many to many or a degenerate fact
=SWITCH(SUM(Fields!Relationship.Value)=0,"",
   Fields!DIMENSION_IS_FACT_DIMENSION.Value=1,"F",
   Fields!MEASUREGROUP_CARDINALITY.Value="MANY","M",
   True,"X")

This will either show an X if there is a regular relationship, or show an M or F if there’s a many to many or degenerate relationship respectively.
To make it easier to read, I also like to set the background colour of the Data textbox to highlight the type of relationship further.

=SWITCH(SUM(Fields!Relationship.Value)=0,"Transparent",
   Fields!DIMENSION_IS_FACT_DIMENSION.Value=1,"Yellow",
   Fields!MEASUREGROUP_CARDINALITY.Value="MANY","Red",
   True,"CornflowerBlue")

If you preview the report you should see the following.

It shows the concept, but it needs a little tidying up. Centering the text in the data textbox helps, but we can also use a fantastic new feature in SSRS 2008 R2 to rotate the column titles. Simply set the WritingMode property in the Localization group to Rotate270, and then shrink the width of the column.

I’ve also added a title, a key, and a level of row grouping using the DIMENSION_MASTER_NAME field, which groups role playing dimensions by their master dimension. It should now look something like this.

That’s it for this report, so save it, then go back to the first report (CubeDoc_Cubes.rdl), right click on the [CUBE_NAME] textbox and open its properties. Go to the action tab, and set the action to navigate to the CubeDoc_Cube report, passing through the CATALOG_NAME and CUBE_NAME fields from the dataset as the parameter values. This sets up a hyperlink from one report to the other, allowing users to navigate around the cube doc reports by clicking on what they want to know about.

We then need to do the same for 3 other reports:
CubeDoc_Dimensions

  • dbo.upCubeDocAttributesInDimension results in a table
  • dbo.upCubeDocMeasureGroupsForDimension results in a table
  • Add an extra @Dimension parameter, populated from dbo.upCubeDocDimensionsInCube

CubeDoc_MeasureGroup

  • dbo.upCubeDocMeasuresInMeasureGroup results in a table
  • dbo.upCubeDocDimensionsForMeasureGroup results in a table
  • Add an extra @MeasureGroup parameter, populated from dbo.upCubeDocMeasureGroupsInCube

CubeDoc_Search

  • Add an extra @Search parameter, text, with no default
  • Table containing results from dbo.upCubeDocSearch, using the @Search parameter

Link all appropriate textboxes (measure groups, dimensions, search etc.) to their relevant report using the report action, and hey presto – a fully automated, real time, self-documenting cube report.

In the next and final installment of this series of blog posts, we’ll explore SQL 2008’s spatial data to generate an automated star schema visualisation to add that little something extra to the reports.

OLAP Cube Documentation in SSRS part 1

Being a business intelligence consultant, I like to spend my time designing data warehouses, ETL scripts and OLAP cubes. An unfortunate consequence of this is having to write the documentation that goes with the fun techy work. So it got me thinking, is there a slightly more fun techy way of automating the documentation of OLAP cubes…

There are some good tools out there such as BI Documenter, but I wanted a way of having more control over the output, and also automating it further so that you don’t have to run an overnight build of the documentation.

I found a great article by Vincent Rainardi describing some DMVs (Dynamic Management Views) available in SQL 2008, which got me thinking: why not just build a number of SSRS reports calling these DMVs, which would then dynamically create the cube structure documentation in real time whenever the report rendered?

This post is the first in a 3 part set which will demonstrate how you can use these DMVs to automate the SSAS cube documentation and user guide.

UPDATE: I presented a 1 hour session at SQLBits 8 covering all of this work; you can watch the video here.

There’s a full list of DMVs available in SQL 2008 R2 on the msdn site.

The primary DMVs that are of interest are:

  • MDSCHEMA_CUBES – lists the cubes in an SSAS database
  • MDSCHEMA_MEASUREGROUPS – lists measure groups
  • MDSCHEMA_DIMENSIONS – lists dimensions
  • MDSCHEMA_LEVELS – lists dimension attributes
  • MDSCHEMA_MEASUREGROUP_DIMENSIONS – enumerates the dimensions of measure groups
  • MDSCHEMA_MEASURES – lists measures

When querying DMVs we can use SQL-style SELECT statements, executed against the cube in a DMX window.

SELECT *
FROM $SYSTEM.MDSCHEMA_CUBES

This returns a dataset like any other SQL query.

We can even enhance it with DISTINCT and WHERE clauses, although they are more restricted than basic SQL. One of the main limitations is the lack of a JOIN operator. A number of the queries that I’ll perform below need to use JOIN, so to get around this I wrap each query in an OPENQUERY command, executed against a SQL Server database with a linked server pointing to the cube. This enables me to perform JOINs using queries such as

SELECT *
FROM OPENQUERY(CubeLinkedServer,
   'SELECT *
    FROM $SYSTEM.MDSCHEMA_MEASUREGROUP_DIMENSIONS') mgd
INNER JOIN OPENQUERY(CubeLinkedServer,
   'SELECT *
    FROM $SYSTEM.MDSCHEMA_MEASUREGROUPS') mg
ON mgd.XXX = mg.XXX


etc.

I’m therefore going to create a number of stored procs to wrap up this functionality; the SSRS reports can then just call the procs.

Within BIDS, every item (cube, measure group, measure, dimension, attribute, hierarchy, KPI, etc.) has a description in the properties pane which is a multi-line free text property. These are exposed by the DMVs, so I’m going to make use of them and bring them out in the reports. This allows you to create the descriptions within BIDS as you’re developing the cube, meaning they’re version controlled and always in sync with the code.

I should also point out that I’m using SQL Server 2008 R2. All of the queries below will work with SQL 2008, but I want to use the spatial report functionality of SSRS 2008 R2 to generate dynamic star schema visualisations, which is only supported in R2.

In this post I’ll script out the stored procedures used as the basis of the documentation. In my next post I’ll put these into SSRS reports.

Let’s get started.

Firstly we need to create our linked server. This script will create a linked server called CubeLinkedServer pointing to the Adventure Works DW 2008R2 OLAP database on the local server.

EXEC master.dbo.sp_addlinkedserver
   @server = N'CubeLinkedServer',
   @srvproduct=N'MSOLAP',
   @provider=N'MSOLAP',
   @datasrc=N'(local)',
   @catalog=N'Adventure Works DW 2008R2'


You’ll have to set up the security according to your requirements. So now let’s start creating the source procs.

The first proc lists all of the cubes. The MDSCHEMA_CUBES DMV returns not only cubes but also dimensions, so I’m filtering it to only return cubes by specifying CUBE_SOURCE = 1.

CREATE PROCEDURE [dbo].[upCubeDocCubes]
  (@Catalog       VARCHAR(255) = NULL
  )
AS
  SELECT *
  FROM OPENQUERY(CubeLinkedServer,
    'SELECT *
     FROM $SYSTEM.MDSCHEMA_CUBES
     WHERE CUBE_SOURCE = 1')
  WHERE CAST([CATALOG_NAME] AS VARCHAR(255)) = @Catalog
    OR @Catalog IS NULL
GO


The next proc returns all measure groups found within a specified cube.

CREATE PROCEDURE [dbo].[upCubeDocMeasureGroupsInCube]
  (@Catalog       VARCHAR(255)
  ,@Cube          VARCHAR(255)
  )
AS
  SELECT *
  FROM OPENQUERY(CubeLinkedServer,
    'SELECT *
     FROM $SYSTEM.MDSCHEMA_MEASUREGROUPS ')
  WHERE CAST([CATALOG_NAME] AS VARCHAR(255)) = @Catalog
    AND CAST([CUBE_NAME] AS VARCHAR(255))    = @Cube
GO


This next proc returns a list of measures within a specified measure group.

CREATE PROCEDURE [dbo].[upCubeDocMeasuresInMeasureGroup]
  (@Catalog       VARCHAR(255)
  ,@Cube          VARCHAR(255)
  ,@MeasureGroup  VARCHAR(255)
  )
AS
SELECT * FROM OPENQUERY(CubeLinkedServer,
  'SELECT *
   FROM $SYSTEM.MDSCHEMA_MEASURES
     WHERE [MEASURE_IS_VISIBLE]')
   WHERE CAST([CATALOG_NAME] AS VARCHAR(255))      = @Catalog
     AND CAST([CUBE_NAME] AS VARCHAR(255))         = @Cube
     AND CAST([MEASUREGROUP_NAME] AS VARCHAR(255)) = @MeasureGroup
GO


The following proc queries all dimensions available within a specified cube. I’m filtering using the DIMENSION_IS_VISIBLE column to only show visible dimensions.

CREATE PROCEDURE [dbo].[upCubeDocDimensionsInCube]
  (@Catalog       VARCHAR(255)
  ,@Cube          VARCHAR(255)
  )
AS
SELECT * FROM OPENQUERY(CubeLinkedServer,
  'SELECT *
   FROM $SYSTEM.MDSCHEMA_DIMENSIONS
     WHERE [DIMENSION_IS_VISIBLE]')
   WHERE CAST([CATALOG_NAME] AS VARCHAR(255)) = @Catalog
     AND CAST([CUBE_NAME] AS VARCHAR(255))    = @Cube
GO


Then we can query all available attributes within a dimension. This DMV returns a bitmask field (LEVEL_ORIGIN) which defines whether the attribute is a key, attribute or hierarchy. I’m using bitwise AND (&) to split this into three separate fields for ease of use. I’m also filtering out invisible attributes, as well as those with a level of 0. Level 0 is the [All] member of any attribute, which we can ignore for this purpose.

CREATE PROCEDURE [dbo].[upCubeDocAttributesInDimension]
  (@Catalog       VARCHAR(255)
  ,@Cube          VARCHAR(255)
  ,@Dimension  VARCHAR(255)
  )
AS
  SELECT *
    , CASE WHEN CAST([LEVEL_ORIGIN] AS INT) & 1 = 1
        THEN 1 ELSE 0 END AS IsHierarchy
    , CASE WHEN CAST([LEVEL_ORIGIN] AS INT) & 2 = 2
        THEN 1 ELSE 0 END AS IsAttribute
    , CASE WHEN CAST([LEVEL_ORIGIN] AS INT) & 4 = 4
        THEN 1 ELSE 0 END AS IsKey
  FROM OPENQUERY(CubeLinkedServer,
    'SELECT *
     FROM $SYSTEM.MDSCHEMA_LEVELS
     WHERE [LEVEL_NUMBER]>0
       AND [LEVEL_IS_VISIBLE]')
  WHERE CAST([CATALOG_NAME] AS VARCHAR(255))          = @Catalog
    AND CAST([CUBE_NAME] AS VARCHAR(255))             = @Cube
    AND CAST([DIMENSION_UNIQUE_NAME] AS VARCHAR(255)) = @Dimension
GO


The next proc returns measure groups with their associated dimensions. We have to join two DMVs together in order to get the description columns of both the dimension and measure group.

CREATE PROCEDURE [dbo].[upCubeDocMeasureGroupsForDimension]
    (@Catalog       VARCHAR(255)
    ,@Cube          VARCHAR(255)
    ,@Dimension     VARCHAR(255)
    )
AS
  SELECT
    mgd.*
    , m.[DESCRIPTION]
  FROM OPENQUERY(CubeLinkedServer,
    'SELECT
       [CATALOG_NAME]
       , [CUBE_NAME]
       , [MEASUREGROUP_NAME]
       , [MEASUREGROUP_CARDINALITY]
       , [DIMENSION_UNIQUE_NAME]
     FROM $SYSTEM.MDSCHEMA_MEASUREGROUP_DIMENSIONS
       WHERE [DIMENSION_IS_VISIBLE]') mgd
   INNER JOIN OPENQUERY(CubeLinkedServer,
     'SELECT
       [CATALOG_NAME]
       ,[CUBE_NAME]
       ,[MEASUREGROUP_NAME]
       ,[DESCRIPTION]
     FROM $SYSTEM.MDSCHEMA_MEASUREGROUPS') mg
        ON  CAST(mgd.[CATALOG_NAME] AS VARCHAR(255))
           = CAST(mg.[CATALOG_NAME] AS VARCHAR(255))
        AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255))
           = CAST(mg.[CUBE_NAME] AS VARCHAR(255))
        AND CAST(mgd.[MEASUREGROUP_NAME] AS VARCHAR(255))
           = CAST(mg.[MEASUREGROUP_NAME] AS VARCHAR(255))
  WHERE CAST(mgd.[CATALOG_NAME] AS VARCHAR(255))            = @Catalog
    AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255))               = @Cube
    AND CAST(mgd.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))   = @Dimension
GO


The next proc is similar to the above, but the opposite way around. It returns all dimensions that are related to a measure group.

CREATE PROCEDURE [dbo].[upCubeDocDimensionsForMeasureGroup]
  (@Catalog       VARCHAR(255)
  ,@Cube          VARCHAR(255)
  ,@MeasureGroup  VARCHAR(255)
  )
AS
  SELECT
    mgd.*
    , d.[DESCRIPTION]
  FROM OPENQUERY(CubeLinkedServer,
    'SELECT
        [CATALOG_NAME]
       ,[CUBE_NAME]
       ,[MEASUREGROUP_NAME]
       ,[MEASUREGROUP_CARDINALITY]
       ,[DIMENSION_UNIQUE_NAME]
       ,[DIMENSION_CARDINALITY]
       ,[DIMENSION_IS_VISIBLE]
       ,[DIMENSION_IS_FACT_DIMENSION]
       ,[DIMENSION_GRANULARITY]
     FROM $SYSTEM.MDSCHEMA_MEASUREGROUP_DIMENSIONS
       WHERE [DIMENSION_IS_VISIBLE]') mgd
  INNER JOIN OPENQUERY(CubeLinkedServer,
    'SELECT
       [CATALOG_NAME]
       ,[CUBE_NAME]
       ,[DIMENSION_UNIQUE_NAME]
       ,[DESCRIPTION]
     FROM $SYSTEM.MDSCHEMA_DIMENSIONS
       WHERE [DIMENSION_IS_VISIBLE]') d
   ON  CAST(mgd.[CATALOG_NAME] AS VARCHAR(255))
       = CAST(d.[CATALOG_NAME] AS VARCHAR(255))
   AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255))
       = CAST(d.[CUBE_NAME] AS VARCHAR(255))
   AND CAST(mgd.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
       = CAST(d.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
  WHERE  CAST(mgd.[CATALOG_NAME] AS VARCHAR(255))        = @Catalog
     AND CAST(mgd.[CUBE_NAME] AS VARCHAR(255))           = @Cube
     AND CAST(mgd.[MEASUREGROUP_NAME] AS VARCHAR(255))   = @MeasureGroup
GO


The next proc builds a BUS matrix, joining every dimension to its related measure groups. Later we’ll use the SSRS tablix control to pivot this into matrix form.

CREATE PROCEDURE [dbo].[upCubeDocBUSMatrix]
    (@Catalog       VARCHAR(255),
     @Cube          VARCHAR(255)
    )
AS
  SELECT
     bus.[CATALOG_NAME]
    ,bus.[CUBE_NAME]
    ,bus.[MEASUREGROUP_NAME]
    ,bus.[MEASUREGROUP_CARDINALITY]
    ,bus.[DIMENSION_UNIQUE_NAME]
    ,bus.[DIMENSION_CARDINALITY]
    ,bus.[DIMENSION_IS_FACT_DIMENSION]
    ,bus.[DIMENSION_GRANULARITY]
    ,dim.[DIMENSION_MASTER_NAME]
    ,1 AS Relationship
  FROM
    OPENQUERY(CubeLinkedServer,
      'SELECT
        [CATALOG_NAME]
        ,[CUBE_NAME]
        ,[MEASUREGROUP_NAME]
        ,[MEASUREGROUP_CARDINALITY]
        ,[DIMENSION_UNIQUE_NAME]
        ,[DIMENSION_CARDINALITY]
        ,[DIMENSION_IS_FACT_DIMENSION]
        ,[DIMENSION_GRANULARITY]
       FROM $SYSTEM.MDSCHEMA_MEASUREGROUP_DIMENSIONS
        WHERE [DIMENSION_IS_VISIBLE]') bus
    INNER JOIN OPENQUERY(CubeLinkedServer,
      'SELECT
        [CATALOG_NAME]
        ,[CUBE_NAME]
        ,[DIMENSION_UNIQUE_NAME]
        ,[DIMENSION_MASTER_NAME]
       FROM $SYSTEM.MDSCHEMA_DIMENSIONS') dim
    ON CAST(bus.[CATALOG_NAME] AS VARCHAR(255))
     = CAST(dim.[CATALOG_NAME] AS VARCHAR(255))
    AND CAST(bus.[CUBE_NAME] AS VARCHAR(255))
     = CAST(dim.[CUBE_NAME] AS VARCHAR(255))
    AND CAST(bus.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
     = CAST(dim.[DIMENSION_UNIQUE_NAME] AS VARCHAR(255))
  WHERE  CAST(bus.[CATALOG_NAME] AS VARCHAR(255)) = @Catalog
     AND CAST(bus.[CUBE_NAME] AS VARCHAR(255)) = @Cube
GO


Next, in order to make it easier for users to find items within the cube, I’ve created a searching proc which will scour a number of the DMVs for anything containing the search term.

CREATE PROCEDURE [dbo].[upCubeDocSearch]
    (@Search        VARCHAR(255)
    ,@Catalog       VARCHAR(255)=NULL
    ,@Cube          VARCHAR(255)=NULL
    )
AS
  WITH MetaData AS
  (
   --Cubes
    SELECT CAST('Cube' AS VARCHAR(20))            AS [Type]
      , CAST(CATALOG_NAME AS VARCHAR(255))     AS [Catalog]
      , CAST(CUBE_NAME AS VARCHAR(255))           AS [Cube]
      , CAST(CUBE_NAME AS VARCHAR(255))           AS [Name]
      , CAST(DESCRIPTION AS VARCHAR(4000)) AS [Description]
      , CAST(CUBE_NAME AS VARCHAR(255))           AS [Link]
    FROM OPENQUERY(CubeLinkedServer,
      'SELECT [CATALOG_NAME], [CUBE_NAME], [DESCRIPTION]
       FROM $SYSTEM.MDSCHEMA_CUBES
       WHERE CUBE_SOURCE = 1')
    WHERE  (CAST([CATALOG_NAME] AS VARCHAR(255))
       = @Catalog OR @Catalog IS NULL)

    UNION ALL

   --Dimensions
    SELECT CAST('Dimension' AS VARCHAR(20))         AS [Type]
      , CAST(CATALOG_NAME AS VARCHAR(255))       AS [Catalog]
      , CAST(CUBE_NAME AS VARCHAR(255))             AS [Cube]
      , CAST(DIMENSION_NAME AS VARCHAR(255))        AS [Name]
      , CAST(DESCRIPTION AS VARCHAR(4000))   AS [Description]
      , CAST(DIMENSION_UNIQUE_NAME AS VARCHAR(255)) AS [Link]
    FROM OPENQUERY(CubeLinkedServer,
      'SELECT [CATALOG_NAME], [CUBE_NAME]
          , [DIMENSION_NAME], [DESCRIPTION]
          , [DIMENSION_UNIQUE_NAME]
       FROM $SYSTEM.MDSCHEMA_DIMENSIONS
         WHERE [DIMENSION_IS_VISIBLE]')
    WHERE  (CAST([CATALOG_NAME] AS VARCHAR(255))
        = @Catalog OR @Catalog IS NULL)
      AND (CAST([CUBE_NAME] AS VARCHAR(255))
        = @Cube OR @Cube IS NULL)
      AND LEFT(CAST(CUBE_NAME AS VARCHAR(255)),1)
        <>'$' --Filter out dimensions not in a cube

    UNION ALL

   --Attributes
    SELECT CAST('Attribute' AS VARCHAR(20))         AS [Type]
      , CAST(CATALOG_NAME AS VARCHAR(255))       AS [Catalog]
      , CAST(CUBE_NAME AS VARCHAR(255))             AS [Cube]
      , CAST(LEVEL_CAPTION AS VARCHAR(255))         AS [Name]
      , CAST(DESCRIPTION AS VARCHAR(4000))   AS [Description]
      , CAST(DIMENSION_UNIQUE_NAME AS VARCHAR(255)) AS [Link]
    FROM OPENQUERY(CubeLinkedServer,
      'SELECT [CATALOG_NAME], [CUBE_NAME]
         , [LEVEL_CAPTION], [DESCRIPTION]
         , [DIMENSION_UNIQUE_NAME]
       FROM $SYSTEM.MDSCHEMA_LEVELS
       WHERE [LEVEL_NUMBER]>0
         AND [LEVEL_IS_VISIBLE]')
    WHERE  (CAST([CATALOG_NAME] AS VARCHAR(255))
         = @Catalog OR @Catalog IS NULL)
      AND (CAST([CUBE_NAME] AS VARCHAR(255))
         = @Cube OR @Cube IS NULL)
      AND LEFT(CAST(CUBE_NAME AS VARCHAR(255)),1)
         <>'$' --Filter out dimensions not in a cube

    UNION ALL

   --Measure Groups
    SELECT CAST('Measure Group' AS VARCHAR(20))   AS [Type]
      , CAST(CATALOG_NAME AS VARCHAR(255))     AS [Catalog]
      , CAST(CUBE_NAME AS VARCHAR(255))           AS [Cube]
      , CAST(MEASUREGROUP_NAME AS VARCHAR(255))   AS [Name]
      , CAST(DESCRIPTION AS VARCHAR(4000)) AS [Description]
      , CAST(MEASUREGROUP_NAME AS VARCHAR(255))   AS [Link]
    FROM OPENQUERY(CubeLinkedServer,
       'SELECT [CATALOG_NAME], [CUBE_NAME]
          , [MEASUREGROUP_NAME]
          , [DESCRIPTION]
        FROM $SYSTEM.MDSCHEMA_MEASUREGROUPS')
    WHERE  (CAST([CATALOG_NAME] AS VARCHAR(255))
       = @Catalog OR @Catalog IS NULL)
     AND (CAST([CUBE_NAME] AS VARCHAR(255))
       = @Cube OR @Cube IS NULL)
     AND LEFT(CAST(CUBE_NAME AS VARCHAR(255)),1)
       <>'$' --Filter out dimensions not in a cube

    UNION ALL

   --Measures
    SELECT CAST('Measure' AS VARCHAR(20))         AS [Type]
      , CAST(CATALOG_NAME AS VARCHAR(255))     AS [Catalog]
      , CAST(CUBE_NAME AS VARCHAR(255))           AS [Cube]
      , CAST(MEASURE_NAME AS VARCHAR(255))        AS [Name]
      , CAST(DESCRIPTION AS VARCHAR(4000)) AS [Description]
      , CAST(MEASUREGROUP_NAME AS VARCHAR(255))   AS [Link]
    FROM OPENQUERY(CubeLinkedServer,
      'SELECT [CATALOG_NAME], [CUBE_NAME]
         , [MEASURE_NAME], [DESCRIPTION]
         , [MEASUREGROUP_NAME]
       FROM $SYSTEM.MDSCHEMA_MEASURES
          WHERE [MEASURE_IS_VISIBLE]')
    WHERE  (CAST([CATALOG_NAME] AS VARCHAR(255))
          = @Catalog OR @Catalog IS NULL)
      AND (CAST([CUBE_NAME] AS VARCHAR(255))
          = @Cube OR @Cube IS NULL)
      AND LEFT(CAST(CUBE_NAME AS VARCHAR(255)),1)
          <>'$' --Filter out dimensions not in a cube

    )
    SELECT *
    FROM MetaData
    WHERE @Search<>''
        AND ([Name] LIKE '%' + @Search + '%'
          OR [Description] LIKE '%' + @Search + '%'
        )
GO


We can now use these procs to form the basis of a number of SSRS reports which will dynamically query the DMVs to generate the SSAS cube documentation. I’ll be covering this stage in my next post.

Leap before you can land


Research scientists at the Southern Illinois University have concluded that primitive frogs learned to jump before they learned to land. The result being a rather inelegant crash landing. I’m sure the frogs were happy that they achieved their goal of getting from A to B, but I do feel sorry for them having to endure the presumably painful landing, not to mention the indignity of having the rest of the pond witness the proceedings!

What struck me about this story, aside from the obvious frog reference (albeit not purple!), was the parallel to Business Intelligence projects that we encounter time and time again. BI systems are put in place, but whilst the project teams are basking in the glory of accomplishing their goal, often sporting a couple of bruises picked up along the way, the wider business looks on, not really knowing what has just happened or understanding how the company or their department will benefit from this leap in technology.

Although BI, as both a business function and technical toolset, has long since reached maturity, its adoption within SMEs is still surprisingly in its youth. Most companies who embark on a BI project seem to do so either because they have been sold on a fantastic vision of the future of information management, or because they have a specific business need to solve. Whatever the reasons for initiating a BI project, there is often very little thought given to how this new information system is likely to impact the wider business, or how the system can be utilised to provide far more benefit than the original project scope called for.

There are two aspects to this impact, the pain and the gain.

The gain

When the company’s data is cleaned, consolidated and remodelled to make reporting and analysis simpler (one of the primary goals of a BI project), the tendency is to use that system to reproduce the company’s old static reports in a slightly more glossy fashion, only faster and more accurate. This stops far short of what a good data warehouse or cube is capable of. Enter self-service BI.

Self-service BI essentially allows your employees to have a level of (secured) access to the underlying BI data, combined with the right tools to perform their own analysis. This enables them to not just consume their old static reports, but to pro-actively go digging for information, knowledge and a greater understanding of your company. Gone are the days when a change to a report has to be documented, requested, specified, programmed, tested and released. A well designed data warehouse and cube, along with the right client tools (often just Excel), enables users to just do it themselves with the bare minimum of training (if any), whilst retaining the confidence that the resulting report is accurate; the warehouse and cube take care of ensuring the underlying business logic is right.

The pain

A data warehouse is, by definition, a combined set of data from a variety of source systems. This means that any change to the source systems must be assessed for their impact on the warehouse. After putting a sparkly new BI infrastructure in place, you don’t want it to fall over because someone changed the structure of a source database or extract file! This adds a layer of change control that may not have existed before, and can cause some headaches.

There is usually a moderately large requirement for business users’ time, both to help the BI architect to understand and model the business properly, and also in testing.

Pain but no gain?

So, if the benefits such as self-service BI (with dashboards, data mining, KPIs, PowerPivot etc.) are not adopted and utilised, the business is left having made an impressive technology leap, but is then stuck with the impact on source systems, a bill, and confusion from staff who don’t see the benefit of getting their old reports via a different method.

Evolution taught the frog that by landing properly, he can quickly jump again to go even further, and avoid the pain in the meantime.

SQL User Group Session 24 June 2010

I’m excited to be presenting another session to the South Wales SQL Server User Group.

On Thursday 24th June 2010, Eversheds in Cardiff are kindly hosting the event, to run from 18:45 to 21:00.
The event is free, and you’ll even get pizza thrown in – what more can you ask for? Oh yes, some BI content…

My session will cover data warehouse modelling, including a number of hands on business case studies including transactional data, account balances and duration based data.

Please feel free to bring your own data modelling problems along and I’ll try and cover as many as I can.

Register for free here: http://www.sqlserverfaq.com/events/235/Data-warehouse-design-case-studies-Other-BI-related-session-TBC.aspx

Hope to see you there!

SQL Server 2008 R2 – PowerPivot and Master Data Services

Purple Frog spent a very interesting day at Microsoft last week, at one of their many events promoting the launch of SQL Server 2008 R2. Rafal Lukawiecki presented an entertaining (as always!) and informative series of talks covering the release, focusing on the enhanced Business Intelligence tools available.

The primary changes to note are

  • Power Pivot – An in-memory, client-side add-in to Excel that allows users to create virtual cubes on their desktop and analyse over 100m records of data virtually instantly
  • DAX – A new expression language, designed for non-technical (probably more realistically, semi-technical) users to extend pivot tables and power pivot tables without having to learn MDX
  • Report Components – In a report consisting of a couple of tables, a chart and a few gauges (gauges, sparklines & maps are all new features of SSRS), you can save each element as a component and re-use it in different reports. This should result in much less duplication of work.
  • Report Builder 3 – A thin-client tool allowing end users to create Reporting Services reports. This is a big enhancement over its predecessor as it is finally fully compatible with reports created in the Business Intelligence Development Studio (BIDS), including report components.
  • Master Data Services – A centralised tool and database intended to provide governance of your organisation’s master data (centralised list of products, fiscal calendar, regions etc.).

The enhancements to Reporting Services (SSRS) are very welcome, and should be of huge benefit to anyone either currently using SSRS or considering using it. I firmly believe that there are no comparable web based reporting engines that even come close for SME organisations when looking at the whole picture including cost of implementation, ease of use, flexibility and capability.

Master Data Services as a concept has been around for a long time, but there has never been a tool available to organisations to effectively implement it. This is Microsoft’s first proper stab at delivering a workable solution, and although I’m a big fan of the concept, and have no doubt of its benefit to a SME, I’m yet to be convinced that the tool is ready for a large scale corporate environment. Time will tell how scalable and manageable the system is, and credit has to go to Microsoft for starting the ball rolling.

The most impressive addition is without a doubt PowerPivot. In a nutshell, it’s a user defined OLAP cube wrapped up within Excel 2010, running entirely in memory on a user’s workstation. If you’ve not yet played with it or seen a demo, I’ll try and elaborate for you… Think about loading Excel with 1 million rows, and then imagine sorting and filtering a number of those columns [cue going out to lunch whilst waiting for Excel to catch up]. With PowerPivot, you can sort and filter over 100 million rows of data almost in an instant – it’s very impressive indeed!

That’s the snazzy demo bit, but to reduce it to a glorified spreadsheet would be very harsh indeed. It allows a user to import multiple data sources and combine them into a single dimensional data model; PowerPivot will create your own personal cube, without you having to build a warehouse, and without knowing anything about MDX, dimension hierarchies, attribute relationships, granularity etc.

Microsoft’s vision and reason for creating this tool is self-service BI, allowing users to create their own cubes, data analysis environments and reporting systems. And this is where I start to have a problem…

I can’t remember the last time I designed a data warehouse where I did not find significant data quality problems, conflicting data, missing data, duplicated data etc. I also find it hard to think of a situation where an end user (even a power user) is sufficiently clued up about the intricacies of a source OLTP database to be able to extract the right data and know what to do with it. Or if they are, a dozen other people in different departments have a different idea about how things work, resulting in many different versions of the truth.

I’m therefore (for now!) sticking with the opinion that it is still absolutely vital for an organisation to provide a clean, consistent, dimensionally modelled data warehouse as the basis for their BI/MI infrastructure. Tools like PowerPivot then sit very nicely on top to provide an incredibly powerful and beneficial user experience, but to try and use the emergence of self-service BI tools to usher in a new ‘non-data warehouse’ era is a very dangerous route which I hope people will avoid.

In summary – this release brings with it a fantastic host of new tools, but with great power comes great responsibility…

SQLBits and Microsoft BI conferences

Following the launch of SQL Server 2008 R2 this week, the ever dedicated SQL Bits team are putting on the next installment tomorrow with SQL Bits VI – ‘the Sixth Sets’. The Purple Froggers will of course be there, hopefully in time for a pre-conference bacon butty!

Also worth mentioning is ‘Microsoft Solutions for Business Intelligence’, a seminar that Microsoft are holding at their Reading campus on May 12 2010. Rafal Lukawiecki is speaking; anyone who has seen him before will know what an entertaining and knowledgeable presenter he is. Whether you are currently implementing a BI solution or are just ‘BI curious’, a technical developer or a business manager, the day will provide an insight into the direction BI is heading in, and what we can expect in the years to come.

We hope to see you at either/both!
