0845 643 64 63

SSIS

Execute SQL Task in ADF

If you’re new to ADF or Synapse pipelines and looking for a way to execute SQL commands in pipelines, unfortunately there is no equivalent to the Execute SQL Task in SSIS, but I have found some alternatives which work well for me.

For something as simple as a SELECT statement where you want to return results to be used later, you can use a Lookup activity:

You can then reference the output of this activity in other activities using dynamic content such as @activity(‘LookUp’).output.firstRow.DateKey.

Although you can also use the Lookup activity for DML commands, it is not designed for this and so I’ve found another way to do this. First, create the following stored procedure in the database:

This stored procedure will execute the SQL command passed in via the @sql parameter. Now you can use the Stored procedure activity to point this newly created stored procedure and pass in a SQL command as shown here:

This activity will now execute the SQL command stated in the parameter value. Using this approach helps keep things consistent across your pipelines for SQL commands as they all use the Stored procedure activity, and the only difference is the SQL command used in the parameters section.

You can also add dynamic content to dynamically build up the SQL command that is passed into the stored procedure, this can allow you to use the same configuration for multiple Stored procedure activities but with different outcomes. For example if you had pipeline variables, the above TRUNCATE TABLE command could be replaced with @concat(‘TRUNCATE TABLE ‘,variables(‘SchemaName’),’.’,variables(‘TableName’)) and this would truncate a different table per pipeline based on the variable values.

Note: there is a limit on the number of characters the parameter value can take, so if you have a large SQL command it is best to create a stored procedure specifically for that command and to reference that stored procedure in the activity.

Pretty simple in the end, I hope this helps!

How to resolve SSIS “Row yielded no match during lookup” error.

Have you ever been faced with the SSIS error “Row yielded no match during lookup”? If so, this blog is for you!

A customer of ours recently faced the same issue in one of their SSIS packages and asked us to investigate this for them. Initial investigations on their side highlighted that when they replicated the lookup component using a standard join in T-SQL (similar to the image below) it returned the expected results.

So why was SSIS reporting an error and ultimately causing the package to fail?

To demonstrate why they faced this problem, I will provide a worked example. It should be noted that the customer example we were presented with was many thousands of rows, so the problem was much less obvious than the one I have created today.

To begin with, I have created 3 tables in my demo database using a subset of Adventure Works data.

These are:

“example.Product” which contains 10 rows and 4 columns. Essentially this could be thought of as my Product Dimension.

“example.Sales” which contains 40 rows and 4 columns. Essentially this could be thought of as my Staging Fact Table.

And finally, “example.FactSales” which contains 40 rows and 4 columns and is the result of the lookup component and ultimately would form the basis of my Sales Fact Table.

Here you will notice that “ProductNumber” has been replaced with “ProductKey”. Behind the scenes in my example SSIS package I am looking up “ProductNumber” from the “example.Product” table to return “ProductKey” in my Sales Fact Table.

*Note: I have specified for the component to fail if it finds any none matching rows.

Below is the data flow task within my example SSIS package.

This package executes without failure and populates my “example.FactSales”. However, I will now replicate the issue our customer in question faced.

To begin I executed the below T-SQL update statement against my “example.Product” table.

This changed the “ProductNumber” from “AR-5381” to “ar-5381” where “ProductKey” was equal to 1. When performing a simple “SELECT TOP 100 * FROM example.Product” we can now see that “ProductNumber” is lower case where “ProductKey” = 1 and the rest of the Product Numbers remain unchanged.

Now when I replicate the existing SSIS package using a standard T-SQL join as before there appears to be no issue – 40 rows are still returned using an inner join.

However, if I now execute the SSIS package the error “Row yielded no match during lookup” is subsequently returned.

So why is this happening?

Quite simply SSIS is case sensitive, accent sensitive and kana sensitive (at least in full cache mode) whereas SQL Server is not.

Now this is a fairly obvious example, but with more complex lookups which may involve multiple columns, it becomes a little more tricky to spot!

To solve this customer example, I used the EXACT function in MS Excel to spot the exact row in the lookup table which was returning the error. The EXACT function compares two text strings and returns “TRUE” if they are exactly the same and “FALSE” if they are not.

Alternatively, you could use SQL Server itself to replicate the lookup behavior of SSIS. Taking the T-SQL script from before, you can alter the existing inner join to a left join and collate using “Latin1_General_BIN” to identify the potential root cause of the lookup failure.

This highlights the importance of cleaning and standardising your lookup columns before using an SSIS lookup data flow component. Here at Purple Frog we tend to use a scalar function to create clean derived columns which are subsequently used for lookup components.

I hope this helps you solve any “Row yielded no match during lookup” errors with your SSIS solution!

Automate changing SSIS connection at runtime

Recently a customer came to us with an issue: “We have a primary/secondary AlwaysOn failover cluster, we want the data warehouse ETL to always pull from the secondary read-only replica. When the primary fails over, how can we detect this and repoint the ETL to the new secondary?”

This led us to the following qualification steps:
1. Is the secondary server accessible?
a. Yes – Use it.
b. No – Try the primary server.
2. Is the primary server accessible?
a. Yes – Use it.
b. No – Fail SQL Agent Job.

This sounds simple enough, but how do we handle this in a SQL Agent Job? Using the built-in SSISDB Store Procedures! (Links at the end for further reading on the Stored Procedures used).

Step 0: Ensure that your Connection has a parameterised option for the ServerName

Step 1: Set the SQL connection manager to the SECONDARY server. (This is done to always try this server first).

DECLARE @var sql_variant = N’1.1.1.1′ –Replace 1.1.1.1 with your SECONDARY destination IP/Servername
EXEC [SSISDB].[catalog].[set_object_parameter_value]
@object_type=20,
@parameter_name=N’CM.XXX.ServerName’, –Replace XXX with the name of your connection
@object_name=N’XXX’, –Replace XXX with the name of your SSIS Package or Project for the level of connection
@folder_name=N’XXX’, –Replace XXX with the name of the folder your SSIS Project is deployed into
@project_name=N’XXX’, –Replace XXX with the name of your SSIS Project
@value_type=V,
@parameter_value=@var
GO

Step 2: Check the connection can access the database. (This can be any query with a pass/fail. In this instance, we want to check whether the database is a writeable replica, but this could be a simple query that will fail if not available).

IF EXISTS(SELECT 1 FROM master.sys.databases WHERE [name] = ‘XXX’) –Replace XXX with the database name
BEGIN
IF NOT (DATABASEPROPERTYEX(‘XXX’, ‘Updateability’) <> ‘READ_WRITE’) –Replace XXX with the database name
THROW 51000, ‘This is not a writeable replica’, 1;
END

Step 3: Set the SQL connection manager to the PRIMARY server.

DECLARE @var sql_variant = N’1.1.1.1′ –Replace 1.1.1.1 with your PRIMARY destination IP/Servername
EXEC [SSISDB].[catalog].[set_object_parameter_value]
@object_type=20,
@parameter_name=N’CM.XXX.ServerName’, –Replace XXX with the name of your connection
@object_name=N’XXX’, –Replace XXX with the name of your SSIS Package or Project for the level of connection
@folder_name=N’XXX’, –Replace XXX with the name of the folder your SSIS Project is deployed into
@project_name=N’XXX’, –Replace XXX with the name of your SSIS Project
@value_type=V,
@parameter_value=@var
GO

Step 4: Check the connection can access the database. (Again this primary server should now be a writeable replica in our case).

IF EXISTS(SELECT 1 FROM master.sys.databases WHERE [name] = ‘XXX’) –Replace XXX with the database name
BEGIN
IF NOT (DATABASEPROPERTYEX(‘XXX’, ‘Updateability’) <> ‘READ_WRITE’) –Replace XXX with the database name
THROW 51000, ‘This is not a writeable replica’, 1;
END

Step 5: Email to say primary server used for information.

DECLARE @EmailBody NVARCHAR(1024)
SET @EmailBody = CONVERT(varchar(10), GETDATE(), 103) + ‘ – the ETL was pointed to the primary version when the server check was run’

EXEC msdb.dbo.sp_send_dbmail
@profile_name = ‘XXX’, –Replace XXX with the name of your SQL mail Profile
@recipients = ‘XXX’, –Replace XXX with the list of email addresses you’d like to notify
@body = @EmailBody,
@subject = ‘XXX’ ; –Replace XXX with the desired email title

Step 6: Continue SQL Agent job as normal

The flow for this would then be as follows:

STEP 1
Success – Go to Step 2 (Connection set successfully, so test the connection).
Failure – Fail Job

STEP 2
Success – Go to Step 6 (Successfully used the secondary replica, so continue with the SQL Agent Job).
Failure – Go to Step 3 (Unable to use the replica, so try the primary).

STEP 3
Success – Go to Step 4 (Connection set successfully, so test the connection).
Failure – Fail Job

STEP 4
Success – Go to Step 5 (Successfully used the primary, so send notification email).
Failure – Fail Job

STEP 5
Success – Go to Step 6 (Continue SQL Agent Job).
Failure – Go to Step 6 (Because if the email notification fails, we still want to run the rest of the SQL Agent Job).

Microsoft Docs for: set_object_parameter_value

Microsoft Docs for: sp_send_dbmail

Any questions? Let me know in the comments below or contact me on twitter: @PurpleFrogReiss

System OutOfMemoryException when deploying SSIS .ispac

If you’ve tried deploying a .ispac using the Integration Services Deployment Wizard, you may have come across the following error:

By default SQL Server uses the 32-bit version of the Deployment Wizard, which is why you’ll be getting the Out of Memory error. If you instead launch the Wizard from the 64-bit location, (default is C:\Program Files\Microsoft SQL Server\130\DTS\Binn\ISDeploymentWizard.exe), then you should be able to deploy with no problem. (Please note, the “130” part of the above string may be different, depending on the version of SQL you have installed. This may reside in one of the other numbered folders in that location).

Using the same server and same project, the 32-bit deployment took over an hour and then failed with the error message, while the 64-bit took under a minute and deployed without issue!

Lookups and If Conditions in Azure Data Factory v2 (ADFv2)

Azure Data Factory v2 (ADFv2) has some significant improvements over v1, and we now consider ADF as a viable platform for most of our cloud based projects. But things aren’t always as straightforward as they could be. I’m sure this will improve over time, but don’t let that stop you from getting started now.

This post provides a walk through of using the ‘Lookup’ and ‘If Condition’ activities to do some basic conditional logic depending on the results of a database query.

Assumptions: You already have an ADF pipeline created. If you want to hook into SSIS then you’ll also need the SSIS Integration Runtime set up – although this is not relevant just for the if condition.

Scene setting: We want to execute an SSIS package, but only if the package is enabled in a config table, and in a ‘Waiting’ state. We can determine this state by calling a stored proc in a database.

Solution: Use a ‘Lookup’ activity to call the stored proc, with the single row output saved to a Dataset. Then an ‘If Condition’ activity to check the results of an expression, followed by whatever activities you want to call when the expression equates to True or False. the overall pipeline will look something like this.

Now we’ll look at some of the configuration of the activities. Before we create the pipeline we need to add a Dataset (actually a data source), pointing to the Azure SQL DB that holds our stored proc. Click the + to add a Factory Resource, and select Dataset, followed by Azure SQL Database.

In the ‘Connection’ tab, click ‘+ New’ to create a Linked Service, and enter the connection string details to your Azure SQL DB. You don’t need to select a table, as this dataset will just be for calling a stored proc, so leave it unselected. I’ve called my dataset ‘ComponentState’.

Now add a pipeline by clicking the + in the Factory Resources pane but this time selecting ‘Pipeline’.

Firstly add a Lookup activity, and set its Source Dataset to the one you’ve just created, and then select the stored proc from the dropdown list. Or you could write a query here instead. If you’ve selected a Stored Proc, you can click on ‘Import Parameter’ to pre-populate the parameters into the section below.

Note that the ‘First row only’ checkbox is selected by default. This is correct if you only want a single result row. If you want the full resultset then untick this. (Note that the remaining steps here would require different expressions).

Then add the ‘If Condition’ activity, and link the two by dragging the green box from Lookup onto the If Condition activity:


In the Settings tab, we need to specify the condition expression to use. We want to compare the ‘ComponentState’ field of the proc to the string ‘Waiting’. So we use the equals() function, although we could also use ==.

The expression is:

@{equals(activity(‘Lookup1′).output.firstRow.ComponentState,’Waiting’)}

The @{ } denotes an expression that will be evaluated. Anything outside these brackets will not get evaluated, just left as a string. The first parameter to the equals function is “activity(‘Lookup1’).output.firstRow.ComponentState“, or the ComponentState field of the first row of the output of the Lookup1 activity. We then compare this to the hard coded string ‘Waiting’.

This will either return True or False, the outcomes of which we can configure in the ‘Activities’ tab,

In the ‘If true Activities’ section, just add the ADFv2 activities that you want to happen if your expression evaluates to True, and the same for False. Note that when you edit this, you get a complete canvas to add as many activities you want, I’ve just added one here, to call an SSIS package.

Hope that’s been useful.

<Frog-Blog Out>

Power BI Sentinel
The Frog Blog

Team Purple Frog specialise in designing and implementing Microsoft Data Analytics solutions, including Data Warehouses, Cubes, SQL Server, SSIS, ADF, SSAS, Power BI, MDX, DAX, Machine Learning and more.

This is a collection of thoughts, ramblings and ideas that we think would be useful to share.

Authors:

Alex Whittles
(MVP)
Jeet Kainth
Jon Fletcher
Nick Edwards
Joe Billingham
Lewis Prince
Reiss McSporran
Microsoft Gold Partner

Data Platform MVP

Power BI Sentinel
Frog Blog Out
twitter
rssicon