Data Factory

  • Microsoft Fabric Mirroring: Quick Start Guide

    Mirroring provides a modern approach to accessing and ingesting data continuously and seamlessly from any database or data warehouse into the Data Warehousing experience within Microsoft Fabric. Here are the key points about Mirroring. Near real-time data access: Mirroring allows users to access changes in the source data almost instantly. It ensures that the data in Fabric’s OneLake remains up to date without complex setup or manual…

    » Read more
  • New Microsoft Fabric Features – A SQLBits and FabCon Round Up!

    March was a big month for Microsoft Fabric and its users, with two major conferences covering all aspects of Microsoft Fabric, from functionality to security and governance. SQLBits, a community conference, came first with a wealth of knowledge from experts using Microsoft Fabric at every level from beginner to experienced. FabCon, a conference run by Microsoft, followed immediately to announce new features and to hear from those…

    » Read more
  • Synapse Script Activity Error “Argument {0} is null or empty.\r\nParameter name: paraKey”

    Came across this issue when trying to get a range of values that are calculated dynamically at runtime. Using the script activity in ADF, I wanted a result set that I could reference as parameters for a notebook later in the pipeline, but hit an error I had not seen before that referenced null arguments.

    The Problem

    The script is simple – get a few key dates from the last few months and set these as variables. Finally, return these…

    » Read more
  • Synapse Copy Activity Fails Over Certain File Size – ADF

    Copy Activity Issue in ADF / Synapse Analytics

    Recently, when trying to copy a .csv file from an FTP source to an Azure Data Lake using a Copy Activity in Azure Synapse, I had an issue where files larger than 16 MB would fail. To overcome this, I took the first 13k rows and created another file from them, which resulted in a 4 MB file. I tested this extra small file and it worked in the copy activity with no issues. I multiplied these same 13k rows out…

    » Read more
  • What is Microsoft Fabric? (Power BI + Synapse + DW + DataLake + ML)

    What Is Microsoft Fabric

    At today’s Build conference, Microsoft announced Fabric. What is this? In simple terms, think of taking Synapse Analytics, Data Warehousing, Data Lakes, Data Factory, Spark Notebooks and Machine Learning, and bringing them all together into Power BI. This is underpinned by Microsoft OneLake, a high-performance, scalable data lake storage layer that supports all of the above. OneLake is, as the name implies, one data lake that can be used across…

    » Read more
  • Azure Data Factory Pricing – How much is my pipeline actually costing me?

    Has a client ever asked you how much it actually costs to run a single pipeline in Azure Data Factory? Have you ever thought ADF pricing is just a black box? Well, hopefully my latest blog post will give you an indication of how you can start calculating the cost of a pipeline run! I will base my analysis on a sample pipeline containing the following activities, as shown below: 1 x Lookup Activity (Pipeline Activity), 1 x Copy Data Activity (Data…

    » Read more
  • ADF Data flow string split

    This is a quick blog showing how to do a string split to get particular items in ADF data flows. Consider the following data, where names and colours are combined into the FullName and Colours columns respectively. Note that the delimiter for FullName is a space and the delimiter for Colours is a comma. To get each individual item and create new columns for this data, use the split function in a Derived Column transformation. The syntax for this…

    » Read more
  • ADF breaking out of a ForEach activity

    Currently, the ForEach activity in ADF (or Synapse) does not break out when an inner activity fails. There are several ways you can force the ForEach to break out. The most common is to cancel the pipeline run when an inner activity fails, and there are already many blogs out there that cover this. My requirement is slightly different, and that is what I will show here. Consider a scenario where you have a ForEach activity which has multiple inner…

    » Read more
  • ADF Dataflow CTE workaround

    At the time of writing, it is not possible to write a query using a CTE in the source of a dataflow. However, there are a few options to deal with this limitation: (1) rewrite the query using subqueries instead of CTEs; (2) use a stored procedure that contains the query and reference the stored proc in the source of the dataflow; (3) write the query as a view and reference the view in the source of the dataflow (this is my preferred method and the one I will…

    » Read more
  • How to parameterise the Execute Pipeline activity in Azure Synapse Analytics.

    Unfortunately, as of April 2022 there is no option to parameterise or add dynamic content to an “Execute Pipeline” activity to invoke a pipeline run. However, there is a method that uses a Microsoft API to overcome this. In this blog post we’ll use the “Pipeline – Create Pipeline Run” API documented below: https://docs.microsoft.com/en-us/rest/api/synapse/data-plane/pipeline/create-pipeline-run…

    » Read more
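
As a rough illustration of the last post above, which invokes a pipeline through the “Pipeline – Create Pipeline Run” API, here is a minimal Python sketch of what such a request might look like. It is not taken from the post. The workspace name, pipeline name, parameter, and token are hypothetical placeholders, and the endpoint format and `api-version` value are assumptions that should be checked against the API documentation. The request is only built here, not sent.

```python
import json
import urllib.parse
import urllib.request


def build_create_run_request(workspace, pipeline_name, parameters, token,
                             api_version="2020-12-01"):
    """Build (but do not send) a POST request that asks Synapse to start a pipeline run.

    Assumptions: the workspace development endpoint has the form
    https://<workspace>.dev.azuresynapse.net, and the api-version default
    above is accepted by the service.
    """
    url = (
        f"https://{workspace}.dev.azuresynapse.net"
        f"/pipelines/{urllib.parse.quote(pipeline_name)}/createRun"
        f"?api-version={api_version}"
    )
    # Pipeline parameters travel as a JSON object in the request body.
    body = json.dumps(parameters).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical values, for illustration only.
request = build_create_run_request(
    workspace="my-workspace",
    pipeline_name="ChildPipeline",
    parameters={"loadDate": "2022-04-01"},
    token="<access-token>",
)
print(request.get_method(), request.full_url)
```

Because the pipeline name is an ordinary function argument, it can be supplied at runtime, which is the part the “Execute Pipeline” activity does not allow. Sending the request (for example with `urllib.request.urlopen`) would additionally need a valid Azure AD access token.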