Reference Guide

Azure Data Lake – The Services. The U-SQL. The C# (Reference Guide)

This post is a reference guide to support an event talk or webinar. The content is intended to assist the audience only. Thank you.

Abstract

How do we implement Azure Data Lake? How does a lake fit into our data platform architecture? Is Data Lake going to run in isolation or be part of a larger pipeline? How do we use and work with U-SQL? Does size matter?! The answers to all these questions and more are in this session as we immerse ourselves in the lake, that’s in a cloud. We’ll take an end-to-end look at the components and understand why the compute and storage are separate services. For the developers, what tools should we be using and where should we deploy our U-SQL scripts? Also, what options are available for handling our C# code-behind and supporting assemblies? We’ll cover everything you need to know to get started developing data solutions with Azure Data Lake. Finally, let’s extend the U-SQL capabilities with the Microsoft Cognitive Services!

Links

What is Azure Data Lake? The Microsoft version.
https://azure.microsoft.com/en-gb/solutions/data-lake/

Understanding the ADL Analytics Unit
https://blogs.msdn.microsoft.com/azuredatalake/2016/10/12/understanding-adl-analytics-unit/

Why use Azure Data Lake? The Microsoft version.
https://azure.microsoft.com/en-gb/solutions/data-lake/

Consuming Data Lake with Power BI – Cross-tenant data refreshes.
https://www.purplefrogsystems.com/paul/2017/06/connecting-power-bi-to-azure-data-lake-store-across-tenants/

U-SQL String Data Type 128KB Limit
https://feedback.azure.com/forums/327234-data-lake/suggestions/13416093-usql-string-data-type-has-a-size-limit-of-128kb

Creating a U-SQL Merge Statement
https://www.purplefrogsystems.com/paul/2016/12/writing-a-u-sql-merge-statement/

U-SQL Looping
https://www.purplefrogsystems.com/paul/2017/05/recursive-u-sql-with-powershell-u-sql-looping/

U-SQL Date Dimension
https://www.purplefrogsystems.com/paul/2017/02/creating-a-u-sql-date-dimension-numbers-table-in-azure-data-lake/

Further Reading

Microsoft Blog – An Introduction to U-SQL in Azure Data Lake
https://blogs.msdn.microsoft.com/robinlester/2016/01/04/an-introduction-to-u-sql-in-azure-data-lake/

Microsoft Documentation – U-SQL Programmability Guide
https://docs.microsoft.com/en-us/azure/data-lake-analytics/data-lake-analytics-u-sql-programmability-guide

Microsoft MSDN – U-SQL Language Reference
https://msdn.microsoft.com/en-US/library/azure/mt591959(Azure.100).aspx

SQL Server Central – Stairway to U-SQL
http://www.sqlservercentral.com/stairway/142480/

Stack Overflow – U-SQL Tag
http://stackoverflow.com/questions/tagged/u-sql

Microsoft Documentation – Cognitive Services with U-SQL in Azure Data Lake
https://docs.microsoft.com/en-us/azure/data-lake-analytics/data-lake-analytics-u-sql-cognitive


Cognitive Services with U-SQL (Reference Guide)

This post is a reference guide to support an event talk or webinar. The content is intended to assist the audience only. Thank you.

Abstract

Microsoft’s Cognitive Services are basically the best thing since sliced bread, especially for anybody working with data. Artificial intelligence just got packaged and made available for the masses to download. In this short talk, I’ll take you on a whirlwind tour of how to use these massively powerful libraries directly in Azure Data Lake with that offspring of T-SQL and C#: U-SQL. How do you get hold of the DLLs and how can you wire them up for yourself? Everything will be revealed, and there will be a chance to see what the machines make of the audience!

Links

Helpful Bits

Why U-SQL?

  • U for unified. Unifying T-SQL and C#.
  • U is the next letter after T. T-SQL > U-SQL.
  • U for U-Boat, because Mike Rys dives into his Data Lake with a U-Boat 🙂
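As a rough illustration of the "unified" point, here is a minimal U-SQL sketch in which SQL-style rowset expressions and C# string methods sit in the same statement. The rowset, column names and output path are hypothetical and chosen only for this example.

```usql
// Hypothetical inline rowset; the names and values are illustrative only.
@people =
    SELECT *
    FROM (VALUES
            ("paul", "andrew"),
            ("mike", "rys")
         ) AS T(FirstName, LastName);

// The SELECT shape is SQL, while each column expression is plain C#.
@formatted =
    SELECT FirstName.ToUpper() AS FirstName,
           LastName.Substring(0, 1).ToUpper() + LastName.Substring(1) AS LastName
    FROM @people;

// Assumed output location; change it to suit your own store.
OUTPUT @formatted
TO "/output/people.csv"
USING Outputters.Csv(outputHeader : true);
```

Because the column expressions are C#, type names, method names and literals such as `true` are case-sensitive, while the U-SQL keywords around them are written in upper case.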

Installing the U-SQL samples and extension files in your Data Lake Storage.

The executed code.

USE [CognitiveServices];

REFERENCE ASSEMBLY ImageCommon;
REFERENCE ASSEMBLY FaceSdk;
REFERENCE ASSEMBLY ImageEmotion;
REFERENCE ASSEMBLY ImageTagging;
REFERENCE ASSEMBLY ImageOcr;

// Read every .jpg under /Images as a byte array; {FileName} exposes the file name as a column
@imgs =
    EXTRACT 
        FileName string, 
        [ImgData] byte[]
    FROM 
        @"/Images/{FileName}.jpg"
    USING 
        new Cognition.Vision.ImageExtractor();

// Extract the number of objects on each image and tag them; FileName is passed through unchanged
@imgTags =
    PROCESS 
        @imgs 
    PRODUCE 
        [FileName],
        [NumObjects] int,
        [Tags] string
    READONLY 
        [FileName]
    USING 
        new Cognition.Vision.ImageTagger();

// Write the results as a quoted CSV file with a header row
OUTPUT @imgTags
TO "/output/ImageTags.csv"
USING Outputters.Csv(quoting : true, outputHeader : true);

 

Paul’s Frog Blog

Paul is a Microsoft Data Platform MVP with 10+ years’ experience working with the complete on-premises SQL Server stack in a variety of roles and industries. Now, as the Business Intelligence Consultant at Purple Frog Systems, he has turned his keyboard to big data solutions in the Microsoft cloud, specialising in Azure Data Lake Analytics, Azure Data Factory, Azure Stream Analytics, Event Hubs and IoT. Paul is also a STEM Ambassador for the networking education in schools’ programme, PASS chapter leader for the Microsoft Data Platform Group – Birmingham, and a speaker and helper at SQL Bits, SQL Relay and SQL Saturday. He is currently the top Stack Overflow user for Azure Data Factory, as well as a very active member of the technical community.
Thanks for visiting.
@mrpaulandrew