I recently came across some unexpected output in a dedicated SQL Pool in Azure Synapse Analytics, upon investigation I realised this is as a result of an issue in Data Flows. This blog shows the issue using dummy data and a temporary workaround. At the time of writing (01/03/2024), this is still an issue and has been raised with Microsoft. I will provide a further update once Microsoft have resolved the issue or provided a suitable fix.

Below is the setup for the Data Flow, which includes a basic source query, a Lookup transformation, a Derived Column transformation to fix null values, and a sink for the final output.

The following image shows the source query and the output:

At this point we have two columns with values, and we want to do a look-up to get another value (usually an ID or key) based on the data we already have. The data for the lookup table is shown below.

Below is the setup for the Lookup transformation, it is a simple match between the relevant columns in SourceData and LookupSource.

The Derived Column transformation called FixNulls simply has the expression iif(isNull(LookupKey), -1, LookupKey) to handle null values for the LookupKey where there is no match in the Lookup transformation.

In the data preview in the Lookup transformation the values for LookupKey are as expected, and it’s the same in the Sink data preview:

However, if I run the pipeline and the output for LookupKey does not match the data preview shown above. The first row shown below is correct, with LookupKey 1, but the second row should have LookupKey 2 and this is what is shown in the data preview in Synapse. But the actual value output is -1, which is the result of the expression in FixNulls.

This behaviour has been raised with Microsoft, I’d expect the data preview to always match the final output. As shown here the issue only happens when one of the lookup columns is blank, the record with values in both lookup columns works as expected. One way to get round this issue is to change the blanks to a hardcoded value, I’ve used a space: ‘ ‘, and when I re-run the pipeline I now get this:

If you’ve come across the same issue then hopefully this helps you with a temporary fix, but as mentioned earlier I’ll provide an update when I have more information from Microsoft.

Tags: , ,