Let’s try and keep this post short and sweet. Diving right in, imagine a scenario where we have an Azure Data Factory (ADF) pipeline that includes activities to run U-SQL jobs in Azure Data Lake (ADL) Analytics. We want to control the U-SQL by passing the ADF time slice value to the script, hopefully a fairly common use case. This isn’t yet that intuitive when constructing the ADF JSON activity, so I hope this post will save you some debugging time.

For my example I’ve created a stored procedure in my ADL Analytics database that accepts a parameter @TimeSliceStart as a string value in the format yyyyMMdd.

[ExampleDatabase].[dbo].[usp_DoSomeStuff](@TimeSliceStart);

This doesn’t have to be a stored procedure. ADF is also happy if you give it U-SQL files or even just inline the entire script. Regardless, the ADF parameter handling is the same.

In my ADF JSON activity I then have the following:

{
    "name": "DataLakeJob1",
    "properties": {
        "description": "Run USQL with timeslice param",
        "activities": [
            {
                "type": "DataLakeAnalyticsU-SQL",
                "typeProperties": {
                    "script": "[ExampleDatabase].[dbo].[usp_DoSomeStuff](@TimeSliceStart);",
                    "degreeOfParallelism": 5,
                    "priority": 1,
                    "parameters": {
                        "TimeSliceStart": "$$Text.Format('{0:yyyyMMdd}', Time.AddMinutes(SliceStart, 0))"
                    }
                }
                // etc ...
            }
        ]
        // etc ...
    }
}

Notice that within typeProperties we have a parameters attribute whose children are the actual variables we want to pass to the U-SQL script.
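To make the parameter value a little more concrete, here is a rough Python stand-in for what the ADF expression in the example evaluates to. This is only an illustration: the slice start date is made up, and ADF evaluates its own expression language rather than anything like this code.

```python
from datetime import datetime, timedelta

# Hypothetical slice start; in ADF this comes from the SliceStart system variable.
slice_start = datetime(2017, 1, 1, 0, 0)

# Time.AddMinutes(SliceStart, 0) adds zero minutes, so the time is unchanged.
shifted = slice_start + timedelta(minutes=0)

# '{0:yyyyMMdd}' means four-digit year, two-digit month, two-digit day.
time_slice_start = shifted.strftime("%Y%m%d")
print(time_slice_start)  # 20170101
```

The result is a plain string, which is why the stored procedure parameter is declared as a string.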

Here are the important things to understand about this ADF parameters attribute.

  • The name of the parameter must match the name of the variable expected by U-SQL exactly.
  • As you would expect, the data types of the expected variable and the JSON parameter must match.
  • It is perfectly acceptable to have multiple parameters in the ADF JSON, listed in any order.
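The first rule is the easiest one to trip over, so a quick check before deploying can save a failed job. Below is a minimal Python sketch, assuming variables appear in the script as @Name. The helper missing_parameters is hypothetical and not part of ADF, and it will also flag any variables the script declares for itself.

```python
import re


def missing_parameters(script: str, parameters: dict) -> set:
    """Return @variables used in the script that have no exactly matching parameter.

    Hypothetical helper for illustration; the comparison is case-sensitive.
    """
    referenced = set(re.findall(r"@(\w+)", script))
    return referenced - set(parameters)


script = "[ExampleDatabase].[dbo].[usp_DoSomeStuff](@TimeSliceStart);"

print(missing_parameters(script, {"TimeSliceStart": "20170101"}))  # set()
print(missing_parameters(script, {"timeslicestart": "20170101"}))  # {'TimeSliceStart'}
```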

So how does this work?…

What ADF does when calling ADL is take the parameters listed in the JSON and write out a bunch of U-SQL ‘DECLARE @Variable’ lines. These then get prepended to the top of the actual U-SQL script before giving it to ADL as a job to run. You can see this if you go into the ADL Analytics blades in the Azure portal, select the job created by the ADF activity, then choose Duplicate Script. This reveals the actual U-SQL used in the job.
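As a rough illustration of that behaviour (not ADF’s actual implementation), the Python sketch below mimics it for string parameters only. The function name prepend_declares and the slice value 20170101 are made up for the example, and the exact text ADF generates may differ.

```python
def prepend_declares(script: str, parameters: dict) -> str:
    """Mimic ADF: turn each parameter into a U-SQL DECLARE line above the script.

    Hypothetical helper for illustration; handles string values only.
    """
    declares = [
        f'DECLARE @{name} string = "{value}";'
        for name, value in parameters.items()
    ]
    return "\n".join(declares + [script])


script = "[ExampleDatabase].[dbo].[usp_DoSomeStuff](@TimeSliceStart);"

# Illustrative value for a slice starting on 1 January 2017.
job_script = prepend_declares(script, {"TimeSliceStart": "20170101"})
print(job_script)
# DECLARE @TimeSliceStart string = "20170101";
# [ExampleDatabase].[dbo].[usp_DoSomeStuff](@TimeSliceStart);
```

Seen this way, the naming and typing rules make sense: the script only works if the generated DECLARE lines define exactly the variables it goes on to use.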

Here’s the proof.

Then…

Just knowing what ADF does when converting the JSON parameters to U-SQL declared variables is the main takeaway here.

That’s it! I promised short and sweet.

Many thanks for reading.

