
Monthly Archives: October 2021

Automate changing SSIS connection at runtime

Recently a customer came to us with an issue: “We have a primary/secondary AlwaysOn failover cluster, we want the data warehouse ETL to always pull from the secondary read-only replica. When the primary fails over, how can we detect this and repoint the ETL to the new secondary?”

This led us to the following qualification steps:
1. Is the secondary server accessible?
   a. Yes – Use it.
   b. No – Try the primary server.
2. Is the primary server accessible?
   a. Yes – Use it.
   b. No – Fail the SQL Agent job.

This sounds simple enough, but how do we handle this in a SQL Agent job? Using the built-in SSISDB stored procedures! (Links at the end for further reading on the stored procedures used.)

Step 0: Ensure that your connection manager has its ServerName property parameterised. Once the project is deployed, a parameterised connection manager property appears in SSISDB as a parameter named CM.<ConnectionName>.ServerName.
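You can sanity-check that the parameter is exposed by querying the SSISDB catalog. A quick sketch using the catalog.object_parameters view:

SELECT p.object_type, --20 = project-level, 30 = package-level
p.object_name,
p.parameter_name,
p.design_default_value
FROM SSISDB.catalog.object_parameters p
WHERE p.parameter_name LIKE 'CM.%' --Connection manager properties are prefixed CM.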

Step 1: Set the SQL connection manager to the SECONDARY server. (This is done so that this server is always tried first.)

DECLARE @var sql_variant = N'1.1.1.1' --Replace 1.1.1.1 with your SECONDARY destination IP/server name
EXEC [SSISDB].[catalog].[set_object_parameter_value]
@object_type=20, --20 = project-level parameter, 30 = package-level parameter
@parameter_name=N'CM.XXX.ServerName', --Replace XXX with the name of your connection
@object_name=N'XXX', --Replace XXX with the name of your SSIS project (or package, if the connection lives at package level)
@folder_name=N'XXX', --Replace XXX with the name of the folder your SSIS project is deployed into
@project_name=N'XXX', --Replace XXX with the name of your SSIS project
@value_type='V', --'V' = literal value, 'R' = reference to an environment variable
@parameter_value=@var
GO

Step 2: Check the connection can access the database. (This can be any query with a pass/fail result. In this instance we want to check that the database is a read-only replica, i.e. that this server is still the secondary, but it could be a simple query that will fail if the database is not available.)

IF EXISTS(SELECT 1 FROM master.sys.databases WHERE [name] = 'XXX') --Replace XXX with the database name
BEGIN
IF DATABASEPROPERTYEX('XXX', 'Updateability') = 'READ_WRITE' --Replace XXX with the database name
THROW 51000, 'This is a writeable replica, not the read-only secondary', 1;
END
ELSE
THROW 51000, 'Database not found on this server', 1;

Step 3: Set the SQL connection manager to the PRIMARY server.

DECLARE @var sql_variant = N'1.1.1.1' --Replace 1.1.1.1 with your PRIMARY destination IP/server name
EXEC [SSISDB].[catalog].[set_object_parameter_value]
@object_type=20, --20 = project-level parameter, 30 = package-level parameter
@parameter_name=N'CM.XXX.ServerName', --Replace XXX with the name of your connection
@object_name=N'XXX', --Replace XXX with the name of your SSIS project (or package, if the connection lives at package level)
@folder_name=N'XXX', --Replace XXX with the name of the folder your SSIS project is deployed into
@project_name=N'XXX', --Replace XXX with the name of your SSIS project
@value_type='V', --'V' = literal value, 'R' = reference to an environment variable
@parameter_value=@var
GO

Step 4: Check the connection can access the database. (Same check again: after a failover, this original primary should now be the read-only secondary, which is what we want the ETL to read from.)

IF EXISTS(SELECT 1 FROM master.sys.databases WHERE [name] = 'XXX') --Replace XXX with the database name
BEGIN
IF DATABASEPROPERTYEX('XXX', 'Updateability') = 'READ_WRITE' --Replace XXX with the database name
THROW 51000, 'This is a writeable replica, not the read-only secondary', 1;
END
ELSE
THROW 51000, 'Database not found on this server', 1;

Step 5: Send a notification email, for information, to say that the primary server was used.

DECLARE @EmailBody NVARCHAR(1024)
SET @EmailBody = CONVERT(varchar(10), GETDATE(), 103) + ' - the ETL was pointed to the primary server when the server check was run'

EXEC msdb.dbo.sp_send_dbmail
@profile_name = 'XXX', --Replace XXX with the name of your SQL mail profile
@recipients = 'XXX', --Replace XXX with the list of email addresses you'd like to notify
@body = @EmailBody,
@subject = 'XXX'; --Replace XXX with the desired email subject

Step 6: Continue the SQL Agent job as normal.

The flow for this would then be as follows:

STEP 1
Success – Go to Step 2 (Connection set successfully, so test the connection).
Failure – Fail Job

STEP 2
Success – Go to Step 6 (Successfully used the secondary replica, so continue with the SQL Agent Job).
Failure – Go to Step 3 (Unable to use the replica, so try the primary).

STEP 3
Success – Go to Step 4 (Connection set successfully, so test the connection).
Failure – Fail Job

STEP 4
Success – Go to Step 5 (Successfully used the primary, so send notification email).
Failure – Fail Job

STEP 5
Success – Go to Step 6 (Continue SQL Agent Job).
Failure – Go to Step 6 (Because if the email notification fails, we still want to run the rest of the SQL Agent Job).
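If you create the job steps in T-SQL rather than through the SSMS GUI, this branching maps onto each step's on-success/on-fail actions. A minimal sketch for Step 2, assuming a hypothetical job named 'DW ETL', using msdb.dbo.sp_add_jobstep:

EXEC msdb.dbo.sp_add_jobstep
@job_name = N'DW ETL', --Hypothetical job name; replace with your own
@step_name = N'Step 2 - Test secondary connection',
@subsystem = N'TSQL',
@database_name = N'master',
@command = N'<the Step 2 check script from above>',
@on_success_action = 4, --4 = go to the step given by...
@on_success_step_id = 6, --...Step 6: continue the ETL as normal
@on_fail_action = 4, --4 = go to the step given by...
@on_fail_step_id = 3; --...Step 3: repoint to the primary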

Microsoft Docs for: set_object_parameter_value

Microsoft Docs for: sp_send_dbmail

Any questions? Let me know in the comments below or contact me on twitter: @PurpleFrogReiss

Replicating SQL windowing functions in Power Query.

I’m sure you’ve all heard of windowing functions in SQL Server. But what if you need to replicate those windowing functions in Power Query?

Here I have a sample CarSales.csv dataset of car sales by brand, year and month, with the number of units sold. I will use this dataset to demonstrate how we can replicate the windowing LAG function in Power Query.

In my example I’d like to calculate the month-on-month sales growth per car brand, per year. Using SQL this would be relatively simple: use the LAG function within a CTE to access the previous row’s value, then subtract the output of the LAG function from the units_sold column.
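As a sketch, assuming the data were loaded into a table called dbo.CarSales with brand, year, month and units_sold columns, the SQL would look something like this:

WITH cte AS (
SELECT brand, [year], [month], units_sold,
--LAG fetches units_sold from the previous row within each brand and year partition
LAG(units_sold) OVER (PARTITION BY brand, [year] ORDER BY [month]) AS units_sold_previous_month
FROM dbo.CarSales
)
SELECT brand, [year], [month], units_sold,
units_sold - units_sold_previous_month AS sales_growth
FROM cte
ORDER BY brand, [year], [month];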

For each brand, year and month this returns the units sold alongside the previous month’s figure, with the difference between the two as the sales growth.

However, what if you found yourself in a situation where you couldn’t use SQL? Perhaps you’ve connected to an Analysis Services cube, or perhaps just a .csv file in an Azure Storage Account. What do you do then…?

I’ll now show you how to use Power Query to solve this problem!

To start, open Power BI Desktop and click “Home” > “Get Data” > “Text/CSV”, then select and open your .csv file. In my case it was CarSales.csv.

Now instead of just opening the data, you’ll want to click the option to “Transform Data” which will then open Power Query.

Once Power Query is open you’ll then need to sort the columns you want to partition by and order by. In my case it was “brand”, “year” and “month” which I sorted in ascending order. To do this click on the small down arrow to the right of each of the columns and click “Sort Ascending”.

This then produces a preview sorted by brand, then year, then month.

The next step is to add an index column. To do this click on “Add Column” in the ribbon, click “Index Column” and then click “From 0”. This will then create a new column called “Index” which will increment by 1 starting from 0.

Once done you’ll then need to create a new column which will add 1 to this “Index” column. To do this click on the “Index” column to select it and then click on “Add Column” from the ribbon, then click “Standard” and finally click “Add”.

This will then launch a new window where you’ll be able to specify the value to add. For this example, as we just want to access the previous row to get the previous month’s sales data, we’ll enter the value 1. This step essentially replicates the [offset] argument of the LAG function. If you wanted to replicate the LEAD function instead, you could use the “Subtract” option to subtract 1 from the index column.

Once done, click “OK” and this will create a new column called “Addition”. To make things clearer for the purpose of this blog we’ll rename this to “Offset_Index”.
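Behind the scenes these two steps are plain M. A self-contained sketch, using a couple of made-up rows in place of the sorted CarSales data:

let
    // Sample rows standing in for the sorted CarSales preview
    Sorted = Table.FromRecords({
        [brand = "Ford", year = 2021, month = 1, units_sold = 100],
        [brand = "Ford", year = 2021, month = 2, units_sold = 120]
    }),
    // 0-based index: effectively a row number
    Indexed = Table.AddIndexColumn(Sorted, "Index", 0, 1),
    // Offset_Index = Index + 1, replicating LAG's offset of 1
    WithOffset = Table.AddColumn(Indexed, "Offset_Index", each [Index] + 1, Int64.Type)
in
    WithOffset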

Next using the newly created “Index” and “Offset_Index” columns we want to left join this table back to itself. To do this click “Home” in the ribbon and then click “Merge Queries”.

This will then open a new window. You now need to select the 2 columns you want to partition by plus the index column. This will form the basis of your join criteria. To do this control + click the columns you want to join on. In my example these are “brand”, “year” and “index”.

Next in the empty drop down select the same table as the table you want to merge to. It will have “(Current)” tagged onto the end of the name. Now ensure that the join kind is set to “Left Outer”.

Next, control + click the 2 columns you want to partition in the new table plus the “Offset_Index” column we created earlier. In my case these were “brand”, “year” and “Offset_Index”.

You’ll now see the following join logic all joined by an “and” condition:

  • “brand” from “CarSales” left joins to “brand” from “CarSales (Current)”, indicated by the “1” next to each column heading.
  • “year” from “CarSales” left joins to “year” from “CarSales (Current)”, indicated by the “2” next to each column heading.
  • “Index” from “CarSales” left joins to “Offset_Index” from “CarSales (Current)”, indicated by the “3” next to each column heading.
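In M this merge is a Table.NestedJoin. A self-contained sketch (the step and column names here are illustrative and will differ from the query editor’s auto-generated ones):

let
    // Indexed sample data as built in the previous steps
    WithOffset = Table.FromRecords({
        [brand = "Ford", year = 2021, month = 1, units_sold = 100, Index = 0, Offset_Index = 1],
        [brand = "Ford", year = 2021, month = 2, units_sold = 120, Index = 1, Offset_Index = 2]
    }),
    // Left join the table to itself: this row's Index matches the
    // previous row's Offset_Index within the same brand and year
    Merged = Table.NestedJoin(
        WithOffset, {"brand", "year", "Index"},
        WithOffset, {"brand", "year", "Offset_Index"},
        "Previous", JoinKind.LeftOuter)
in
    Merged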

Next click OK and this will create a new column called “Renamed Columns” (which was just the name of the previous Power Query step), which essentially contains all of the columns from the newly joined (or merged) table.

Next, expand the “Renamed Columns” column by clicking the double-headed arrow and select the column you want to access from the previous row. In my case it was the “units_sold” column from the “CarSales (Current)” table.

Now you’ll want to rename this column to something that’s more meaningful to the context of your data. In my example I renamed this column to “units_sold_previous_month”.

Finally, to solve the original problem, we just need to calculate the sales growth per month, per brand, per year. To do this click “Add Column” in the ribbon and then click “Custom Column”. When the new window opens, call the new column something meaningful to your data and enter a custom formula to solve your problem. In my example I called this column “sales_growth” and entered a custom formula of [units_sold] - [units_sold_previous_month].

This then created a new column called “sales_growth”.

Lastly, the 3 helper columns we created in the earlier steps can now be deleted. In my example these were “Index”, “Offset_Index” and “units_sold_previous_month”. To do this control + click the columns you want to remove, right-click and then select “Remove Columns”.

And that’s it! These steps have successfully replicated the LAG function in Power Query!
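For reference, the whole transformation can be written as a single Power Query (M) query. This is a sketch that assumes the file sits at C:\CarSales.csv with brand, year, month and units_sold columns, so adjust the Source step (and any names) to suit:

let
    Source = Csv.Document(File.Contents("C:\CarSales.csv"), [Delimiter = ",", Encoding = 65001]),
    Promoted = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
    Typed = Table.TransformColumnTypes(Promoted, {
        {"brand", type text}, {"year", Int64.Type},
        {"month", Int64.Type}, {"units_sold", Int64.Type}}),
    // Sort by the "partition by" columns, then the "order by" column
    Sorted = Table.Sort(Typed, {{"brand", Order.Ascending}, {"year", Order.Ascending}, {"month", Order.Ascending}}),
    // Index and offset columns replicate LAG's offset of 1
    Indexed = Table.AddIndexColumn(Sorted, "Index", 0, 1),
    WithOffset = Table.AddColumn(Indexed, "Offset_Index", each [Index] + 1, Int64.Type),
    // Self left-join on brand, year and Index = Offset_Index
    Merged = Table.NestedJoin(
        WithOffset, {"brand", "year", "Index"},
        WithOffset, {"brand", "year", "Offset_Index"},
        "Previous", JoinKind.LeftOuter),
    // Pull last month's units_sold through from the joined row
    Expanded = Table.ExpandTableColumn(Merged, "Previous", {"units_sold"}, {"units_sold_previous_month"}),
    // sales_growth is null for each partition's first month, just as LAG would return NULL
    Growth = Table.AddColumn(Expanded, "sales_growth", each [units_sold] - [units_sold_previous_month], Int64.Type),
    // Drop the helper columns
    Result = Table.RemoveColumns(Growth, {"Index", "Offset_Index", "units_sold_previous_month"})
in
    Result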

I hope this step-by-step guide helps you solve your windowing function needs in Power Query.
