Calling U-SQL Stored Procedures with C# Code Behind

So friends, some more lessons learnt when developing with U-SQL and Azure Data Lake. I’ll try and keep this short.

Problem

You have a U-SQL stored procedure written and working fine within your Azure Data Lake Analytics service. But we need to add some more business logic or something requiring a little C# magic. This is the main thing I love about U-SQL, having that C# code behind file where I can extend my normal SQL behaviour. So, being a happy little developer you write your class and method to support the U-SQL above and you recreate your stored procedure. Great!

Next, you try to run that stored procedure…

[ExampleDatabase].[dbo].[SimpleProc]();

But are hit with an error, similar to this:

E_CSC_USER_INVALIDCSHARP: C# error CS0103: The name ‘SomeNameSpaceForCodeBehind’ does not exist in the current context.


Why?

Submitting U-SQL queries containing C# code behind methods works fine normally. But once you wrap it up as a stored procedure within the ADL analytics database the complied C# is lost. Almost as if the U-SQL file/procedure no longer has its lovely code behind file at all!

Just to be explicit with the issue. Here is an example stored procedure that I’ve modified from the Visual Studio U-SQL Sample Application project. Note my GetHelloWord method that I’ve added just for demonstration purposes.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
DROP PROCEDURE IF EXISTS [dbo].[SimpleProc];
 
CREATE PROCEDURE [dbo].[SimpleProc]()
AS
BEGIN
 
    @searchlog =
        EXTRACT UserId INT,
                START DateTime,
                Region string,
                Query string,
                Duration INT?,
                Urls string,
                ClickedUrls string
        FROM "/Samples/Data/SearchLog.tsv"
        USING Extractors.Tsv();
 
    @WithCodeBehind =
        SELECT 
            *,
            SomeNameSpaceForCodeBehind.MyCodeBehind.GetHelloWorld() AS SomeText
        FROM @searchlog;
 
    OUTPUT @WithCodeBehind
    TO "/output/SearchLogResult1.csv"
    USING Outputters.Csv();
 
END;

This U-SQL file then has the following C#, with my totally original naming conventions. No trolls please, this is not the point of this post 🙂

1
2
3
4
5
6
7
8
9
10
11
namespace SomeNameSpaceForCodeBehind
{
    public class MyCodeBehind
    {
        static public string GetHelloWorld()
        {
            string text = "HelloWorld";
            return text;
        }
    }
}

So, this is what doesn’t work. Problem hopefully clearly defined.

Solution

To work around this problem, instead of using a C# code behind file for the procedure we need to move the class into its own assembly. This requires a little more effort and plumbing, but does solve this problem. Plus, this approach is probably more familiar to people that have ever worked with CLR functions in SQL Sever that they want to use within a stored procedure.

This is what we need to do.

  • Add a C# class library to your Visual Studio solution and move the U-SQL code behind into a library name space.

  • Build the library and use the DLL to create an assembly within the ADL analytics database. The DLL can live in your ADL store root, in line it or create it from Azure Blob Store. I have another post on that here if your interested.
CREATE ASSEMBLY IF NOT EXISTS [HelloWorld] FROM "assembly/ClassLibrary1.dll";
  • Finally, modify your stored procedure to use the assembly instead of the code behind name space. The new stored procedure should look like this.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
DROP PROCEDURE IF EXISTS [dbo].[SimpleProc];
 
CREATE PROCEDURE [dbo].[SimpleProc]()
AS
BEGIN
 
    //Complied library:
    REFERENCE ASSEMBLY [HelloWorld];
 
    @searchlog =
        EXTRACT UserId INT,
                START DateTime,
                Region string,
                Query string,
                Duration INT?,
                Urls string,
                ClickedUrls string
        FROM "/Samples/Data/SearchLog.tsv"
        USING Extractors.Tsv();
 
    @WithCodeBehind =
        SELECT *,
               //Changed TO USE assembly:
               HelloWorld.ClassLibrary1.GetHelloWorld() AS SomeText
        FROM @searchlog;
 
    OUTPUT @WithCodeBehind
    TO "/output/SearchLogResult1.csv"
    USING Outputters.Csv();
 
END;

This new procedure executes without error and gets around the problem above.

I hope this helps and allows you to convert those complex U-SQL scripts to procedures, while retaining any valuable code behind functionality.

Many thanks for reading

2 Responses to Calling U-SQL Stored Procedures with C# Code Behind

  • What if I just want to call a function in my assembly to get say a file input path or output path (string) ?

    • Hi Graham, great question. For that you probably want to consider some higher level management task that first resolves the path of the file first. Then either dynamically generates the USQL to include the path. Or maybe have a U-SQL procedure with a parameter where you pass in the path. Either way it would require a 2 step process. Getting the U-SQL code behind to do this first would be a chicken/egg situation. Unless you consider writing yourself a custom extractor. Mmmm. Thinking about it more. I wonder if U-SQL would allow the setting of a local variable as the file path from the result of a code behind method. I think you’ve given me an idea for my next blog post 🙂 Cheers Paul

Leave a Reply to Paul Andrew Cancel reply

Your email address will not be published. Required fields are marked *

HTML tags are not allowed.

3,604 Spambots Blocked by Simple Comments

Paul’s Frog Blog

Paul is a Microsoft Data Platform MVP with 10+ years’ experience working with the complete on premises SQL Server stack in a variety of roles and industries. Now as the Business Intelligence Consultant at Purple Frog Systems has turned his keyboard to big data solutions in the Microsoft cloud. Specialising in Azure Data Lake Analytics, Azure Data Factory, Azure Stream Analytics, Event Hubs and IoT. Paul is also a STEM Ambassador for the networking education in schools’ programme, PASS chapter leader for the Microsoft Data Platform Group – Birmingham, SQL Bits, SQL Relay, SQL Saturday speaker and helper. Currently the Stack Overflow top user for Azure Data Factory. As well as very active member of the technical community.
Thanks for visiting.
@mrpaulandrew