Tag Archive: Azure Data Lake

  • Synapse Copy Activity Fails Over Certain File Size – ADF

    Copy Activity Issue in ADF / Synapse Analytics: Recently, when trying to copy a .csv file from an FTP source to an Azure Data Lake using a Copy Activity in Azure Synapse, I had an issue where files larger than 16MB would fail. To overcome this, I took the first 13k rows and created another file from them, which resulted in a 4MB file. I tested this much smaller file and it worked in the copy activity with no issues. I multiplied these same 13k rows out…

  • Business Intelligence in Azure – SQLBits 2018 Precon

    What can you expect from my SQLBits pre-conference training day in February 2018 at the London Olympia? Well my friends, in short, we are going to take a whirlwind tour of the entire business intelligence stack of services in Azure. No stone will be left unturned. No service will be left without scalability. We’ll cover them all and we certainly aren’t going to check with the Azure bill payer before turning up the compute on our data…

  • Azure Data Lake – The Services. The U-SQL. The C# (Reference Guide)

    This post is a reference guide to support an event talk or webinar. The content is intended to assist the audience only. Thank you. Abstract: How do we implement Azure Data Lake? How does a lake fit into our data platform architecture? Is Data Lake going to run in isolation or be part of a larger pipeline? How do we use and work with U-SQL? Does size matter?! The answers to all these questions and more in this session as we immerse ourselves in…

  • Connecting PowerBI.com to Azure Data Lake Store – Across Tenants

    Welcome readers, this is a post to define a problem that shouldn’t exist. But sadly it does exist, and given its relative complexity I think it warrants some explanation. Plus, I’ve included details of what you can currently do if you encounter it. First some background… Power BI Desktop: With the recent update to the Power BI Desktop application we now find the Azure Data Lake Store connector has finally relinquished its ‘(Beta)’ status and…

  • Recursive U-SQL With PowerShell (U-SQL Looping)

    In its natural form U-SQL does not support recursive operations, and for good reason. This is a big data, scale-out, declarative language where the inclusion of procedural, iterative code would be very unnatural. That said, if you must pervert things, PowerShell can assist with the looping and, dare I say, the possibility of dynamic U-SQL. A couple of caveats… From the outset, I accept this abstraction with PowerShell to achieve some…

  • Calling U-SQL Stored Procedures with C# Code Behind

    So friends, some more lessons learnt when developing with U-SQL and Azure Data Lake. I’ll try and keep this short. Problem: You have a U-SQL stored procedure written and working fine within your Azure Data Lake Analytics service, but you need to add some more business logic or something requiring a little C# magic. This is the main thing I love about U-SQL, having that C# code behind file where I can extend my normal SQL behaviour. So,…

  • Storing U-SQL Assemblies in Azure Blob Storage

    I’m hoping the title of this post is fairly self-explanatory. You’re here because, like me, you found that the MSDN language reference page for creating U-SQL assemblies states that it’s possible to store the DLLs in Azure Blob Storage, but it doesn’t actually tell you how. Well, please continue my friends and I’ll show you how. The offending article: https://msdn.microsoft.com/en-us/library/azure/mt763293.aspx The…

  • Creating a U-SQL Date Dimension & Numbers Table in Azure Data Lake

    Now we all know what a date dimension is and there are plenty of really great examples out there for creating them in various languages. Well, here’s my U-SQL version creating the output from scratch using a numbers table. Remember that U-SQL needs to be handled slightly differently because we don’t have any iterative functionality available. Plus its ability to massively parallelise jobs means we can’t write something that…

  • Passing Parameters to U-SQL from Azure Data Factory

    Let’s try and keep this post short and sweet. Diving right in, imagine a scenario where we have an Azure Data Factory (ADF) pipeline that includes activities to perform U-SQL jobs in Azure Data Lake (ADL) Analytics. We want to control the U-SQL by passing the ADF time slice value to the script, hopefully a fairly common use case. This isn’t yet that intuitive when constructing the ADF JSON activity, so I hope this post will save…

  • Writing a U-SQL Merge Statement

    Unlike T-SQL, U-SQL does not currently support MERGE statements, our friend that we have come to know and love since its introduction in SQL Server 2008. Not only that, but U-SQL doesn’t currently support UPDATE statements either… I know… Open mouth emoji required! This immediately leads to the problem of change detection in our data and how, for example, we should handle the ingestion of a daily rolling 28-day TSV extract,…
