
Alex Whittles

SQL Server 2016 Licensing Changes, SP1, v.Next CTP1, etc.

By now you’ve probably heard the exciting news announced at today’s Microsoft Connect conference; SQL Server licensing is changing, significantly, from SQL 2016 SP1.

The functionality of SQL Server Standard, Web & Express editions is being brought up to [almost, 99%] match that of Enterprise Edition. This has a number of critical impacts:

  1. Developers can write applications using a single set of SQL Server functionality, and can now make use of previously Enterprise only features without imposing the cost on their customers.
  2. The customer now decides whether to use Express, Web, Std or Ent based on scalability, not functionality.
  3. It follows the pricing model already used by most cloud services; pay for usage rather than features.
  4. This brings security to the masses – possibly one of the biggest impacts is that now we can use Row Level Security, Dynamic Data Masking and Always Encrypted in ALL editions of the engine – huge news (see the sketch below).
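As a flavour of what this opens up, here’s a minimal T-SQL sketch of Dynamic Data Masking and Row Level Security. The table, column and function names are made up for illustration:

-- Dynamic Data Masking: mask the email column for non-privileged users
ALTER TABLE dbo.Customer
    ALTER COLUMN Email ADD MASKED WITH (FUNCTION = 'email()');

-- Row Level Security: an inline predicate function plus a security policy
CREATE FUNCTION dbo.fn_SalesFilter(@SalesRep AS sysname)
    RETURNS TABLE
    WITH SCHEMABINDING
AS
    RETURN SELECT 1 AS Allowed WHERE @SalesRep = USER_NAME();
GO

CREATE SECURITY POLICY dbo.SalesPolicy
    ADD FILTER PREDICATE dbo.fn_SalesFilter(SalesRep) ON dbo.Sales;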

There are some limitations to this though:

  1. It currently only applies to the SQL engine, not to SSIS, SSAS or SSRS – Please Microsoft, follow suit with the BI tools asap!!!
  2. Some features are not [yet] available in the Express and/or Local DB editions, including Change Data Capture, In Memory OLTP and Polybase to name a few.
  3. The trigger to move to Enterprise will now be [almost] purely driven by size and performance requirements.

As if that isn’t enough, other huge announcements include:

  • SQL Server 2016 SP1 is now available
  • New functionality in 2016 SP1, including
    • CREATE OR ALTER functionality for Stored Procs, Views, Functions, etc.
    • DBCC CLONEDATABASE – To easily clone a database schema without the data (love this!!) – see the quick examples after this list
    • NUMA support for Analysis Services Tabular – I’ll be checking this out in more detail soon and will blog my findings
  • SQL Server v.Next CTP1 (Community Technology Preview) is now available, which includes SQL Server running on Linux! Note that only the engine is available on Linux at this time, not the BI tools – but watch this space, the plan is to offer matching functionality on both.
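As a quick taste of those first two SP1 features, here’s a minimal T-SQL sketch (object names are made up for illustration):

-- CREATE OR ALTER removes the old IF EXISTS ... DROP boilerplate
CREATE OR ALTER VIEW dbo.vwCustomer
AS
SELECT CustomerID, CustomerName
FROM dbo.Customer;
GO

-- DBCC CLONEDATABASE copies the schema and statistics, but none of the data
DBCC CLONEDATABASE (AdventureWorks, AdventureWorks_Clone);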

 

For full details, check out the Microsoft blog post here

/Frog-Blog-Out

What is U-SQL?

By now you may have heard about U-SQL, the new kid on the query language block. But what is U-SQL? Where is it? What’s it for?

I was lucky enough to be at the 2015 MVP Summit in Redmond, at which one of the sessions was hosted by Michael Rys (@MikeDoesBigData), talking about U-SQL. As its creator, there’s no-one better to learn the ropes from. I was pretty much blown away by what it can do and its ease of access, so I’ve spent the last couple of months playing around with it, and doing a few talks at conferences about it.

But there’s a lot of confusion over what it is, so here’s my high-level summary of what it does and where it fits, so you can judge whether it’s relevant to you.

What Does U-SQL Look Like?

It’s a hybrid of T-SQL and C#. Think of it as writing a SQL query, but being able to embed the power of C# and the .Net framework.

So you could write a query that looks like this:

@CustomersClean =
  SELECT *
     ,((CustomerGender=="M" || CustomerGender=="F") ? CustomerGender : "-") AS CustomerGender_Clean
     ,MyUSQLApp.Udfs.ValidateDate(CustomerDoB) AS CustomerDoB_Clean
  FROM @Customers;

So the basic SELECT * FROM Table syntax is familiar to anyone used to T-SQL, but with the addition of some C# to clean up the Gender and DoB columns. For the DoB, we’re calling a user-defined C# function (ValidateDate) that sits in an assembly behind the query.
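For context, the @Customers rowset would typically be populated by an EXTRACT statement over a file in Data Lake Storage, and the cleaned rowset written back out with OUTPUT. A minimal sketch, where the file paths and schema are assumptions for illustration:

@Customers =
  EXTRACT CustomerID int,
          CustomerGender string,
          CustomerDoB string
  FROM "/input/customers.csv"
  USING Extractors.Csv(skipFirstNRows: 1);

OUTPUT @CustomersClean
TO "/output/customers_clean.csv"
USING Outputters.Csv();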

How do I use U-SQL?

U-SQL is a part of Azure Data Lake. Data Lake is a core component in the Cortana Analytics Suite, Microsoft’s cloud/Azure data platform. Think of Data Lake Storage as a massively scalable and high performance staging and permanent storage area for all of your structured, semi-structured and unstructured data, based on the Hadoop File System (HDFS). It can store both tables of data and files (csv, text, images, whatever you want to store).

Once your data is held in Data Lake Storage, you can use Data Lake Analytics to do cool stuff with it. This could include ‘Big Data’ processing and analytics, i.e. running large and complex parallel analytics queries over your data, whether that’s MB, GB, TB or even PB. Or it could be transforming and manipulating your data into different structures.

Either way, Data Lake Analytics uses U-SQL as its language. At the moment this is the only place that U-SQL exists, but given its power and flexibility I’m hoping it will also have a bright future outside of Azure Data Lake.

You can write and execute U-SQL directly in the Azure Portal, or for a much richer development environment you can install some plugins into Visual Studio 2013 or 2015. I recommend the latter.

Download the Azure SDK for .Net (2.7.2+) here

Download the Data Lake Tools for Visual Studio 2013 or 2015 here

So it’s a Big Data tool?

Yes, and no.

Yes it has the scalability, performance and functionality for Big Data, and this is what it’s designed to do. When you write a simple U-SQL script, the Data Lake Analytics engine takes the query and parallelises it, scaling it up to as many nodes as you want. So it’s scalable. Very scalable. It’s being used now for complex big data analytics jobs with TB of data, and can scale way beyond that.

But it’s more than this: I also see it as a data preparation and transformation tool. If you want an Azure based BI solution, maybe using Azure SQL DW, or even a smaller scale system using Azure SQL DB, until now we haven’t had a PaaS ETL tool in Azure. But now we do! U-SQL provides a very powerful and comprehensive platform for all your data transform needs.

What about Azure Data Factory (ADF)? – No, ADF isn’t an ETL tool; it’s an orchestration and coordination tool, and isn’t realistic as a data transformation tool.

What about ELT? – Yes that’s fair, to a point. But in my experience the ELT approach actually needs to be ETLT to work properly. The 2nd ‘T’ is taken care of inside the SQL engine, but a lot of source data has to be pre-processed before it can even be landed in a SQL Server table. U-SQL gives us that power, which T-SQL cannot provide.

Summary

I’m not going to go into any more detail here on how to use or write U-SQL; there will be more posts to come on that.

But I hope this provides some clarity on what it is and where it fits in our ever expanding data platform world.

Data Lake went into public preview in October 2015, so it’s ready and waiting for you to have a go!

 

Frog-blog-out

 

PASS Summit and MVP Summit 2015

I’m currently sitting in Seattle-Tacoma airport waiting for BA048 to take me home, and coming to terms with what an amazing couple of weeks I’ve been fortunate to have.

The PASS Summit

I had the honour of being selected to speak at the PASS Summit last week, the biggest SQL Server conference in the world, with I believe in the region of 5,500 attendees. I’ve been a regular speaker at lots of the European events (SQL Bits, SQL Relay, SQL Saturday, SQL Server Days, user groups, etc.) for years and I love speaking at them all. But this was different. The sheer scale of it, the enthusiasm and interaction of the audience, the chance to visit Seattle for the first time, and the parties. Wow the parties. It’s going to take me some time to sober up after this.

If you want to get a feel for the size, this is a photo of the keynote on day 1. It’s big.

[Photo: the day 1 keynote]

What’s important to note is that during the day the focus is on learning, and there are a very large number of sessions to choose from. At night it’s a chance to network, to meet old friends and new, to talk shop but also to get to know people better. Don’t underestimate the value in going to the parties.


My session went very well, with some great feedback and questions from the audience, thanks to all those that came and were part of my first Summit talk. I hope you left with a renewed love for MDX, and I hope you enjoyed it even half as much as I did – I had a blast!

The MVP Summit

On October 1st, something awesome happened – I was made a SQL Server MVP. Having spent my professional life looking up to and learning from the amazing group of MVPs we have in the UK and worldwide, it was a thrilling, humbling and daunting prospect to be invited to join their lofty ranks. I’ve been meaning to blog about it since then but haven’t been able to articulate or even understand what this actually means, apart from a cool blue badge.

All of that changed this week at the MVP Summit in Redmond. As I was already heading out to Seattle for the PASS Summit, it was easy to extend my stay by a week, and boy am I glad that I did.

I can’t talk about the technical content due to NDAs etc., but I will do my best to describe what it’s like.

Do you remember the first time you met a SQL Superstar? I do, it was at SQL Bits and I bumped into Jamie Thomson of SSIS guru fame. I didn’t even recognise him as he didn’t resemble his twitter profile pic at the time, but after a short chat I carried on up the stairs with a huge grin on my face, only to walk into a room with Allan Mitchell, Simon Sabin and countless others. I sat there for a while just soaking it all up.

Walking into the MVP Summit this week brought it all back, a thousand times over. I’m sat in a room next to Ola Hallengren, Paul Randal, Kimberly Tripp, Stacia Misner, Itzik Ben-Gan, Brent Ozar, JRJ, Kevin Kline (and the list goes on and on and on). Listening to the likes of Joseph Sirosh, Shawn Bice, Mark Souza, etc. – the decision makers of the whole data platform. Then technical sessions from the likes of Michael Rys (U-SQL inventor) and Kasper de Jonge (SSAS). Not only are they presenting, but they’re asking; asking for input, asking for feedback, asking for details of what we see in the field, what we need in the product suite. I really felt that the whole team opened up and wanted a two way discussion. There’s a reason that SQL Server now tops the Gartner magic quadrant, and listening to these folks talk it’s easy to understand why; their vision and passion are inspiring.

And then, yes, more beers in the evening to continue the discussions.

More discussions on the product suite, discussions on sport, life, beer, cars, etc. etc. Getting to know each other better, whilst continuously learning. More and more learning.

I have to say a huge thank you to the whole product team for giving us so much time, and putting up with all our questions. It was an invaluable experience that I’ll never forget.

Also a big thanks to the other MVPs at the summit for being so welcoming. Especially Chris Testa-O’Neill, for being my guide for the week and helping it feel less daunting!

But the biggest thank you must go to Jennifer Moser for coordinating and planning it all so well. Not just the summit, but the activities around it; the scheduling, the transport, the pub crawl, access to the product group, and of course the inaugural MVPs Vs Microsoft American Football match.

Yes that’s right. On the Saturday before the summit, we braved the elements and headed to a local sports field for a game of flag football. Or at least ‘intentional’ flag football, but Mark Souza’s two broken ribs (and Simon Sabin’s subsequent reallocation to the Access MVP programme) are evidence that it was not entirely as contact-less as it should have been!

Mark’s team well and truly beat us, but we put up a valiant fight. Secretly we think that Mark had employed a few college football pros onto the SQL Server team just for this match. I couldn’t walk for a few days after, but we all had an awesome time.

What next?

My brain is well and truly fried. I need time to digest it all, and then I’ll be writing a few blog posts about what I can share.

But for now I’m just hoping to get some much needed sleep on the overnight flight back to Heathrow, and really looking forward to seeing my wife, Hollie, who has been amazing in supporting me and enabling me to get out here. Thanks Hol x.

So how would I sum it all up? I’m tired, drained, jet-lagged, brain-fried, hungover and my feet are falling off. But I loved every second of it.

Frog-Blog-Out

MDX Compare or Rank Similar Members

Or should this be called: “Recreating DAX’s EARLIER function in MDX” – either way, a useful technique that solves a problem…

MDX makes it very easy for us to compare one member against others, using functions such as RANK() etc. But how do we dynamically compare a member against a subset of other members? I came across a customer requirement recently where we had to rank a member against other members that had similar properties, based on other measures. This will probably make more sense with an example…

Taking AdventureWorks, we can easily calculate the sales rank of each product using something like this:

WITH MEMBER [Measures].[Rank] AS
   RANK([Product].[Product].CURRENTMEMBER
       ,[Product].[Product].[Product].MEMBERS
       ,[Measures].[Internet Sales Amount])
SELECT {[Measures].[Internet Sales Amount], [Measures].[Rank]} ON 0
      ,NONEMPTY({[Product].[Product].[Product].MEMBERS}
                ,[Measures].[Internet Sales Amount]) ON 1
FROM [Adventure Works]
WHERE [Date].[Calendar Year].&[2008]

[Screenshot: the query results, showing each product’s rank]

This ranks each product against all other products. But is that a fair comparison? Comparing sales of a water bottle against a top of the range racing bike. Not really. So how about we instead rank each product against all products within +/-20% of the same cost. So if the water bottle costs £20, then we would rank its sales against all products with a cost between £16 and £24. This gives a more accurate idea of how well each product is performing compared to its peers.

Although to keep the MDX simple here, let’s just say any product within £20.

In AdventureWorks we have [Measures].[Internet Average Unit Price], which can be used to determine comparable products. So how do we go about achieving this?

If we look at the RANK function, it takes three parameters: the member being ranked, the set over which to rank, and the measure to be used.

All we have to do is filter the second parameter, the set over which to rank, to include similar members. So maybe something like this:

WITH MEMBER [Measures].[Rank] AS
   RANK([Product].[Product].CURRENTMEMBER
       ,FILTER([Product].[Product].[Product].MEMBERS
              ,ABS([Measures].[Internet Average Unit Price]
               -([Measures].[Internet Average Unit Price]
                ,[Product].[Product].CURRENTMEMBER))
               <=20
               )
       ,[Measures].[Internet Sales Amount])

If we break this down, we’re just changing the 2nd parameter to be a filtered set, where the unit price is within £20 of the unit price of the current member. This should work, right?

Unfortunately, wrong. The results look exactly the same as the original rank – nothing has changed.

[Screenshot: the results, unchanged from the original rank]

The problem here is that CURRENTMEMBER is within the filter function, so it changes context to refer to whatever member is being considered at the time by the filter function. So [Measures].[Internet Average Unit Price] and ([Measures].[Internet Average Unit Price],[Product].[Product].CURRENTMEMBER) always refer to the same product, and no rows are filtered out. CURRENTMEMBER does NOT refer to the current member being considered by the RANK function, but by the FILTER function.

In DAX we have the EARLIER and EARLIEST functions, which would be great here, and would allow us to step out of the current context into the previous calculation layer. But unfortunately we haven’t been blessed with an MDX EARLIER function. So how do we fix this in MDX?

The trick here is dynamic sets, using the STRTOSET function. This allows us to grab the member being ranked, and treat it as a fixed member within the FILTER function.

WITH MEMBER [Measures].[Rank] AS
   RANK([Product].[Product].CURRENTMEMBER
       ,STRTOSET('
          FILTER([Product].[Product].[Product].MEMBERS
                ,ABS([Measures].[Internet Average Unit Price]
                 -' + CSTR([Measures].[Internet Average Unit Price]) + ')
                 <=20
                )'
       )
       ,[Measures].[Internet Sales Amount])

We’re building up a string, which fixes the value of [Measures].[Internet Average Unit Price] to that of the product being ranked, and then dynamically compares it to the value of [Measures].[Internet Average Unit Price] for all other products. Those within £20 will be included in the resulting set, and will be used to rank the original product.
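For example, for a product whose average unit price happened to be 4.99 (a made-up value), the string passed to STRTOSET would evaluate to this set expression:

FILTER([Product].[Product].[Product].MEMBERS
      ,ABS([Measures].[Internet Average Unit Price] - 4.99)
       <=20
      )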

[Screenshot: the new results]

You can see the result in the screenshot above, where the ranking is dependent on the average unit price.

Frog-Blog Out

Excel crashes when enabling macros

I’ve had a problem with Excel 2013 crashing a number of times when opening/editing spreadsheets with macros. When I click ‘Enable Content’ to enable macros, the Excel window fades out, goes unresponsive and then crashes with a glorious “Microsoft Excel has stopped working, Windows is checking for a solution to the problem” message. Unfortunately Windows never seems to be able to find the elusive solution that we need.


I’ve found that the spreadsheets tend to stay working whilst you’re looking at them (if you don’t click ‘Enable Content’), however Excel crashes when you try to do anything useful such as copy/paste, etc. Sometimes Excel just crashes when opening the file, even without enabling macros. This is seemingly caused by a corrupt compile of the macros. Yes I know it’s VBA code which is interpreted and not compiled, but I’m over-simplifying here.

There are a couple of solutions to this:

1) Before enabling the macros, try adding a new blank sheet, then save the spreadsheet and reload it; sometimes it corrects itself. But not always.

2) Recompile the macro

  • Open the Excel spreadsheet (but don’t enable macros yet)
  • In the ‘Developer’ ribbon, click ‘Visual Basic’


  • On the ‘Tools’ menu, click ‘Options’
  • Then on the ‘General’ tab, untick ‘Compile On Demand’, then click OK


  • Then scroll down in the ‘Project’ window, right click on ‘Modules’, then Insert a new module


  • Then just close the Visual Basic window, save the spreadsheet and close Excel.
  • Hey presto, when you then load your spreadsheet you’ll be able to enable macros, and Excel won’t crash.

Thanks to Paulina from Microsoft for helping me find this solution – it’s saved my bacon a few times now!

Frog-Blog-Out

First thoughts on DataZen

Following the recent announcement of Microsoft acquiring DataZen, I’ve been having a play around to see what it can and can’t do. Here’s a very brief summary so far:

The Good:

  • Very quick and easy design interface
  • Fantastic way of modifying dashboard layout for tablets and mobile devices
  • It seems to ‘just work’ very well
  • You don’t have to worry about the pixel perfect layout of what’s where, it takes care of it for you

Screen shots of the same dashboard running on an iPhone 6 and in a web browser:

[Screenshots: iPhone 6 and browser views of the same dashboard]

The Bad:

  • The designer is Windows 8 only. Really?!
  • Yes it technically connects to SSAS cubes, but the interface is quite frankly no more than smoke and mirrors. It connects in the same way as QlikView does; it seems that you have to write an MDX query to return an entire fact table at whatever level of granularity you want to be able to filter by. DataZen then pulls back all of the data and re-aggregates it for your dashboard. This may be ok for a small cube, but with a large measure group that needs to be sliced/diced by a number of different attributes at the same time this quickly becomes a bad idea. It is not able to use the power of SSAS; it treats it as simply another flat data source.
  • Scalability seems to be limited. For example you can provide a flat dataset containing a key and a parent key, and DataZen will turn it into a tree filter for you. Except that I tried this with a 7,000-node parent-child hierarchy in a tree and it just couldn’t cope. It did eventually load after numerous attempts, but it was so unresponsive that it was unusable. It seemed to work ok with a few hundred nodes.
  • You don’t have to worry about the pixel perfect layout of what’s where, it takes care of it for you. Yes this is a good point, but it also unfortunately means that there is little scope for customisation. This gets frustrating when it doesn’t ‘just work’. For example resizing a chart with a legend, in some sizes on the mobile view the legend was so big it left no space for the chart. It would be nice to be able to turn off the legend for the mobile view or something similar.
  • SSAS calculation logic is not supported. Well, this is an extension of the smoke and mirrors SSAS implementation, but it’s particularly relevant to this point. One of our existing clients is a heavy dashboard user, with data sourced from SSAS multidimensional cubes. They have a number of KPIs defined, for which the target is the actual value of the previous period. SSAS takes care of this beautifully: if the user selects a month then the target is the previous month’s value; if they select a week then the target is the previous week’s value, etc. However DataZen provides SSAS with no context over what is selected, and so SSAS is not able to dynamically do its magic. Therefore KPI targets cannot be dynamic; they need to be static and fixed at the point of DataZen data refresh.
  • Dates from MDX queries don’t seem to want to hook into the Time Navigator filters, and there’s no way of forcing them to. Although I’m guessing that this is a problem with how I’m doing it, so probably not fair to include it here.

Summary

I’ve not been playing with the tool for very long, so I may find ways around all of this. However at the moment I’m concerned that Microsoft are placing their dashboard/analytics future on a product that doesn’t properly support SSAS. In my mind the core strengths of the MS Business Intelligence offering are underpinned by the power of cubes, backed up by SSIS and the strength of the SQL Server database platform. To buy into a dashboarding platform that doesn’t support and build on this is a cause for concern.

My hope/expectation is that Microsoft take elements from PowerView (that does properly support realtime queries against a cube) and elements from ProClarity/PerformancePoint (e.g. the decomposition tree, etc.), and embed them into DataZen. In which case they could end up with an awesome product.

Yes I’m a cube guy, so I’m naturally focused on the poor SSAS integration. If you don’t use cubes then you’ll probably love it. But MS have some work to do to bring me around.

Anyway, time will tell what happens next… Let’s keep our fingers crossed.

Frog-Blog-Out

Excel creates inefficient MDX

Whilst investigating a slow set of Excel pivot tables (connected to a multidimensional SSAS cube), I found some odd behaviour in how Excel generates its MDX, sometimes doing far more work than required and causing a significant performance reduction.

Take the following example, just querying Customer Count by Promotion, against the AdventureWorks cube.

[Screenshot: pivot table of Customer Count by Promotion]

The profile trace (just using “Query Begin”, “Query End” and “Query Subcube Verbose”) shows that the query runs two “Query Subcube Verbose” calls, which means two separate calls to the SSAS storage engine.

The first Query Subcube event requests “0” for Promotion, which is the [All] member, and so is used as the total in the pivot table. The second queries “*”, which returns every member in the Promotion attribute, which in this example returns values for each of the 4 different promotions.

This makes sense, as for a distinct count measure, you can’t simply add up the components, you do need to calculate the subtotal or grand total separately. The query just takes twice as long as a result.

So let’s say that the total is irrelevant to the user, so they disable totals within Excel…

When we then refresh the pivot, we should only see a single storage engine call. Right? Wrong.

There are still two calls to the storage engine, and SSAS is still calculating the grand total for the query, which Excel is then just discarding.

In a small cube this may make very little noticeable difference to the pivot performance. But on a multi-terabyte cube with billions of records, distinct count operations can take minutes to run. And calculating the non-required totals can make a significant difference.

This becomes even worse when multiple attributes are stacked on the rows and/or columns. Each new attribute requires its own totals calculating, resulting in more and more calls to the storage engine. In the following pivot I’ve added an extra two attributes to the rows, with no subtotals or grand totals.

The following trace shows that 8 Subcube queries were performed.

These correspond to detail and totals for each combination of attributes, which calculates as 2^n storage queries, where n is the number of attributes selected on the rows/columns of a pivot. Here n = 3 (Promotion plus the two extra attributes), giving 2^3 = 8 subcube queries.

Now it’s common practice for users to want to create pivots with 4, 5, 6+ attributes on rows/columns, and the workload on the cube increases exponentially with each one they add. This is not good!

There are two workarounds to this:

1) Forcibly disable totals for relevant dimensions in the cube.

2) Use Excel 2013 (or above), and hard code the set of rows to exclude totals.

 

Let’s look at these in more detail…

1) Forcibly disable totals for relevant dimensions in the cube.

Let’s say that for a particular cube, customer counts were only ever calculated at the monthly level, and it didn’t make sense to aggregate customer counts up to a total across all months/years. We could just add some MDX into the Calculation script to disable date totals for the relevant measure(s).

SCOPE([Measures].[Customer Count], [Date].[Calendar].[All]);
      THIS=null;
END SCOPE;

Excel will still produce an MDX query asking for the date totals, but SSAS will just ignore it and return null. Note that this obviously only works where there is no business sense in asking for the total.

 

2) Use Excel 2013 (or above), and hard code the set of rows to exclude totals.

Excel 2013 introduced an ability to customise the MDX of a pivot. On the Analyze ribbon, click ‘Fields, Items & Sets’, and then ‘Create Set Based on Row Items…’. You can then remove any totals or rows that you don’t want, including the subtotals and grand total.

Or if you’re an MDX guru, you can click on ‘Edit MDX’ and write your own sets for even more performance, and also to make it more dynamic.

When we run this, we get the same output in Excel, but the profile trace now shows:

Just a single call to the storage engine, instead of 16. 1/16 of the work for the same results.

Now beware that once you do this, you’re essentially hard coding that part of the pivot, so in this example any new promotions would not automatically show in the pivot unless they were added. That’s why learning some MDX and writing a more dynamic set is preferable.
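As a rough sketch of a more dynamic set, the hard-coded rows could be replaced with the leaf-level members themselves, so new promotions appear automatically (the member and measure names assume the AdventureWorks cube used earlier):

NONEMPTY([Promotion].[Promotion].[Promotion].MEMBERS
        ,[Measures].[Customer Count])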

But the obvious answer to this is, please, please Microsoft, fix the terrible MDX that Excel creates! If you agree, please upvote the Connect item requesting that it be improved.

Frog-Blog Out

 

Microsoft Acquires Datazen

Exciting news in the Microsoft Analytics space. On 14th April 2015 Microsoft announced they are acquiring Datazen – what does this mean, and is it a good or a bad thing?

For all of PowerView’s good points, it’s still a very immature product. I’ve still not had a single customer take it seriously, aside from a few proof of concept implementations. It’s a good ‘personal analytics’ tool, but in my opinion it doesn’t yet hit the mark when it comes to an enterprise grade analytics front end.

PerformancePoint is still the only MS tool that offers a traditional dashboarding interface, and quite frankly it’s well past its best-before date. With almost zero development in recent years it has been left behind as a clunky relic of days gone by, although one that is still in surprisingly widespread use, for no reason other than the lack of alternative MS options.

With great 3rd party tools like Pyramid Analytics and Tableau out there, Microsoft have really dropped the ball on this. We’ve all been waiting for PowerView to mature and grow, but now Microsoft has announced that they’re acquiring Datazen, which could be a game changer.

I’ve not used the tool before, but I’ll be downloading it asap and will report back with findings in due course. It ticks a number of boxes: multiple data sources including OLAP, SQL, etc., mobile friendly (apparently), interactivity, KPIs, maps, etc.

The key question for me is how Microsoft will approach SSAS cube development with Datazen. Will they maintain and develop full support for multidimensional cubes, or will they prioritise tabular cube functionality? Time will tell.

New Azure User Group in Birmingham

We’re launching a new Azure user group in Birmingham, UK! The first session is on Thursday 16th April, 6.30pm at the MAC (Midlands Art Centre) next to Edgbaston Cricket Ground.

The first session will feature two of the most prominent Azure MVPs in the UK; Richard Conway (Blog | Twitter) and Andy Cross (Blog | Twitter).

With cloud based solutions becoming ever more prevalent, providing free Azure training and information for the local technical community is critical to the growth and skill base in the area.

It will run alongside the existing SQL Server User Group, and will follow a similar format; casual and informal, with two one-hour presentations from technical experts, and a light snack and time for networking in the middle.

The SQL Server User Group will run in odd months, with the Azure group running in even months.

A huge thank you to Richard and Andy, for not only speaking at our first event, but also for helping get the group up and running.

For further details, or to register, check out:

Birmingham Azure User Group

Birmingham SQL Server User Group

 

BIML – What is it?

I’ve noticed a growing trend over the last year – the ever growing presence of BIML (Business Intelligence Markup Language). So what is it? What does it do? And do you need to learn it?

What is BIML?

Simply, it’s a way of defining the functionality of an SSIS (Integration Services) package. If you’ve ever opened an SSIS .dtsx file in notepad you’ll see a daunting mess of GUIDs that you really don’t want to play around with. BIML is a simple XML format that allows you to write an SSIS package in notepad. When you run a BIML script it creates SSIS packages for you. These can then be opened and edited in BIDS exactly the same as an SSIS package that you’d created manually.

To show the difference, first of all this is a sample BIML script:

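The original post showed this as a screenshot; here’s a representative sketch along the same lines (the connection string, package and table names are illustrative, not from the original):

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Connections>
        <Connection Name="DW" ConnectionString="Provider=SQLNCLI11;Data Source=.;Initial Catalog=DW;Integrated Security=SSPI;" />
    </Connections>
    <Packages>
        <Package Name="Truncate Dim Date" ConstraintMode="Linear">
            <Tasks>
                <ExecuteSQL Name="Truncate dim.Date" ConnectionName="DW">
                    <DirectInput>TRUNCATE TABLE dim.Date;</DirectInput>
                </ExecuteSQL>
            </Tasks>
        </Package>
    </Packages>
</Biml>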

Then, when this is compiled into an SSIS package it looks like this in the front end:

[Screenshot: the resulting SSIS package in the BIDS designer]

But this is what you get when you open the .dtsx package in notepad:

[Screenshot: the raw .dtsx XML]

The BIML script is a little easier to digest!

But Why?

But why on earth would you want to do that, when you can just use the BIDS/Visual Studio GUI? The answer is C# and automation. You can mix C# code in with the BIML XML (in a similar way to PHP or old school ASP scripts). This allows you to have a single BIML script, which can apply itself to every item in a list, or every table in a database, and automatically generate all of your SSIS packages from a single template.

Yes, this is very cool stuff.

The same script can be extended to loop through every table in the ‘dim’ schema of a data warehouse, creating a package that truncates the relevant dim table.
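A sketch of what such a BimlScript can look like, with a hard-coded table list in the C# nugget standing in for a live metadata query (all names are illustrative, and the ‘DW’ connection is assumed to be defined as in the earlier sketch):

<#@ template language="C#" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Packages>
        <# foreach (var table in new[] { "Date", "Customer", "Product" }) { #>
        <Package Name="Truncate dim <#=table#>" ConstraintMode="Linear">
            <Tasks>
                <ExecuteSQL Name="Truncate dim.<#=table#>" ConnectionName="DW">
                    <DirectInput>TRUNCATE TABLE dim.<#=table#>;</DirectInput>
                </ExecuteSQL>
            </Tasks>
        </Package>
        <# } #>
    </Packages>
</Biml>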

With this, just running the script will create multiple SSIS packages at the click of a button.

How do you create and run a script?

Firstly you need BIDS Helper. But you should have that anyway.

Create a new Integration Services project, then right click on the project and click ‘Add New Biml File’


This will add a BIML script file into the Miscellaneous folder of the project.

Once you’ve written a script you can test it (right click on the script and select ‘Check Biml for Errors’), or you can run the script, generating the SSIS packages, by clicking ‘Generate SSIS Packages’.

So, do you need to learn BIML?

I have no doubt that BIML is the future of SSIS. Once you see the full power of it, you’ll never want to go back to manually coding packages again.

If you’re an SSIS pro then there’s a good chance that your next job will require BIML. Or if a potential employer doesn’t ask for it, you can certainly improve your chances of getting the job by selling it (and your skills) to them.

At Purple Frog, all of our SSIS development is now 90% automated using BIML, leaving us more time to focus on the 10% of work that needs some custom tweaking or more enhanced logic.

What if you don’t like coding?

Well in that case, check out MIST from Varigence. It’s a GUI for BIML, and a lot more besides. If you’re going to be using BIML a lot then it may well be worth the investment.

<Frog-Blog Out>
