

Deploy Azure AutoML Model And Consume In Excel

It has come back to my turn to write a blog post, and if you remember, my previous one covered why you should use Azure-based AutoML and how to do so. If you followed that, you will be left with a model which you have scored and whose performance you know, but no way to deploy and use it. I will outline the steps needed to do this (which involves a major shortcut, as we are using an AutoML model), and then show you the VBA required to consume the model in Microsoft Excel.

Create an endpoint

The first step is to navigate to your selected model in Azure Machine Learning Studio, click the dropdown on the Deploy tab and select the option to deploy to a web service.

You will then be met by the below screen, which is where we really benefit from using an AutoML model! If this were not an AutoML model, we would have two other fields here asking for an entry script file and a Conda dependencies file, which tell whatever is calling the model how to use it and what packages it needs to run. Luckily, AutoML has already done this for you, so all you need to do is give the deployment a name (no spaces or special characters allowed) and then, for ease of use, choose Azure Container Instance as the compute type. Then click deploy!

Then navigate to the ribbon on the left of your screen and select 'Endpoints' to see your endpoint. Open it and you will see a deployment state of 'Transitioning' for a while; once this changes to 'Healthy' you are ready to go. When ready, you will see a few tabs at the top of the page (under the deployment's name). The main ones are 'Test', where you can manually enter values in a JSON format to get a result, and 'Consume', where you can see the REST endpoint you have created, as well as useful code snippets showing how to call your model from C#, Python and R; but not Excel, which will be more familiar to your average data consumer.

How to consume in Excel

Before we write any VBA, we need to make a table in a spreadsheet for the data to be entered into and the results to be outputted to. If you go back to the 'Test' tab, it's useful to look at the JSON there, because the features in your table must be in the same order as they appear in that JSON.
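As a rough illustration, the payload the endpoint expects looks like this (the four feature values here are hypothetical; yours will match the number and order of your model's features):

{
  "Inputs": {
    "data": [[5.1, 3.5, 1.4, 0.2]]
  },
  "GlobalParameters": 1.0
}

This is exactly the structure the VBA further down builds as a string, one row at a time.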

You can then create a table which mirrors this.

You then need to write some VBA to link these elements up, so you first have to create a VBA module. To do this, open the VBA editor, which is accessed via the Developer tab by clicking 'Visual Basic'.

You then need to click the VBA project with the name of your Excel workbook, then click the Insert tab and select 'Module'. This is where you will write your VBA, which we will do in two sections. The first is a function which performs the HTTP request to the endpoint: we want it to take our features and the REST endpoint, make an HTTP request with these, and store the response. The second is a sub that creates the JSON we saw in the 'Test' tab earlier, as this is how the endpoint needs the input data packaged, and then passes it to the function to send to your model.

Function RunScore(Features, RestEndpoint)

    ' Post the JSON payload to the REST endpoint and return the response
    If RestEndpoint <> "" Then
        Url = RestEndpoint
        Set objHTTP = CreateObject("MSXML2.ServerXMLHTTP")
        objHTTP.Open "POST", Url, False
        objHTTP.setRequestHeader "Content-type", "application/json"
        objHTTP.send Features
        RunScore = objHTTP.ResponseText
    Else
        MsgBox "No web URL provided"
    End If

End Function

Sub Score()

    On Error GoTo ErrHandler

    ' Count the rows of data, starting from the first data cell
    Range("B2").Select
    NumRows = Range(Selection, Selection.End(xlDown)).Rows.Count

    For r = 2 To 1 + NumRows

        ' Build the JSON payload for this row - the features must be in
        ' the same order as they appear in the 'Test' tab JSON
        body = "{" & Chr(34) & "Inputs" & Chr(34) & ":{" & Chr(34) & "data" & Chr(34) & ": [[" & Range("B" & r) & "," & Range("C" & r) & "," & Range("D" & r) & "," & Range("E" & r) & "]]}," & Chr(34) & "GlobalParameters" & Chr(34) & ": 1.0}"

        ' Store the REST endpoint from the 'Consume' tab
        WEBURL = "http://999def99-9e9d-999e-b99a-aa1c12820857.uksouth.azurecontainer.io/score"

        ' Call the scoring function with the payload and the REST endpoint
        ret = RunScore(body, WEBURL)

        ' Extract the prediction from between the square brackets in the response
        openPos = InStr(ret, "[")
        closePos = InStr(ret, "]")
        midbit = Mid(ret, openPos + 1, closePos - openPos - 1)
        midbit = Round(midbit, 2)

        ' Write the result to the output column
        Range("F" & r) = midbit

    Next r

Exit Sub
ErrHandler:
    MsgBox (Err.Description)
End Sub

Ignoring anything that mentions errors (this is just a wrapper that tells you what's gone wrong, if something does go wrong): we select the first cell where there should be data and then find out how many rows of data there are. We then loop through the rows, creating the JSON for each row and storing it in body. It is essential that you use the string exactly as it is above, only editing the cell locations to match the length and order of your data; the features must be in the order that they are shown in the JSON we looked at earlier. After storing the REST endpoint in a variable, we call the function we made earlier. We then have to do a bit of manipulation on the output, as it will be wrapped in brackets, before placing the extracted value into the result location we created in the spreadsheet earlier.
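The exact response depends on your model, but it will contain the prediction wrapped in square brackets, along the rough lines of this illustrative (not real) output:

{"result": [123.456]}

This is why the code pulls out everything between the first '[' and the first ']'.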

We now need a button in the spreadsheet to allow us to run the code 'at the click of a button', to excuse the pun. This can be done via the Developer tab in Excel, where you need to click Insert and select a button. The button is placed via a click-and-drag action in the spreadsheet, exactly like inserting shapes in Excel. Upon releasing your mouse button, a menu will pop up listing the macros you have in the spreadsheet; select 'Score' (or whatever you have named it), which will assign the macro to the button.

It is now ready to go: insert the features you want values for (on a row-by-row basis) and, when you have entered them in the table, simply click the button we created. The model will be run and you will see the results outputted in your table!

The SQL Query Alias Conundrum – Order of Execution


So, you have just written a query, hit execute and encountered an error: Invalid column name ‘[column name]’.

The column you’ve used in your WHERE clause cannot be identified by its alias. You’ve defined it at the top of the query and used it fine previously as your ORDER BY condition, so why can’t the engine recognise it?
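For example, a query along these lines (the table and column names are purely illustrative) reproduces the problem:

SELECT
    UnitPrice * Quantity AS LineTotal
FROM Sales.OrderLines
WHERE LineTotal > 100   -- error: Invalid column name 'LineTotal'
ORDER BY LineTotal;     -- this reference is fine

The WHERE clause rejects the alias, yet the ORDER BY accepts it happily. The answer lies in the order of execution.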

If you’ve ever written a SQL query, this should look familiar:

  1. SELECT
  2. DISTINCT
  3. FROM
  4. JOIN
  5. ON
  6. WHERE
  7. GROUP BY
  8. HAVING
  9. ORDER BY

It makes sense that we read the query from top to bottom / left to right (especially for those in the West), but contrary to how it may look, this isn’t how the query is executed. The order of execution is:

  1. FROM / JOIN
  2. WHERE / ON
  3. GROUP BY
  4. HAVING
  5. SELECT
  6. DISTINCT
  7. ORDER BY

When you stop and think about what is happening, logically, this order makes sense.

FROM / JOIN – First and foremost, we need to determine the location of the data set that we are querying.

WHERE / ON – We need to narrow that data set down with predicates to keep only the records we care about.

GROUP BY – If you need to aggregate the results, this is done next to give us an even more concise result set.

HAVING – WHERE 2.0! This is a predicate over the aggregated result set.

SELECT – Okay, now we have our final result set, containing ONLY our required rows. Now, finally, we can select which columns we want from that dataset.

DISTINCT – Now we know which columns we want from the result set, we can strip out any duplicates.

ORDER BY – Finally, we have whittled the result set down to only the rows and columns we need, so we can order it however we want.


What is the value in knowing this?

It's important to understand the difference between the lexical order of a query (the order you write it in) and the logical order (the order the engine executes it in). Knowing this will help you troubleshoot any syntax errors you may encounter.

Going back to our original issue of using an alias in a WHERE clause, you can now see why you get an error about an invalid column.

Look back at the order of execution: aliases are assigned to columns during the SELECT stage, and WHERE comes before SELECT, so the engine is being asked to filter on an alias before it has been assigned. ORDER BY comes after SELECT, so the same issue doesn't arise there.
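The usual workarounds, sketched with the same illustrative names as before, are to repeat the expression in the WHERE clause, or to wrap the query in a derived table so the alias has already been assigned by the time the outer WHERE runs:

-- Option 1: repeat the expression in the WHERE clause
SELECT
    UnitPrice * Quantity AS LineTotal
FROM Sales.OrderLines
WHERE UnitPrice * Quantity > 100;

-- Option 2: a derived table, so the alias exists before the outer WHERE
SELECT LineTotal
FROM (
    SELECT UnitPrice * Quantity AS LineTotal
    FROM Sales.OrderLines
) AS src
WHERE LineTotal > 100;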

It also explains why IntelliSense doesn't autocomplete or suggest column names until you've added the FROM clause. Until you tell the engine where the data is coming FROM, it doesn't know which columns it can SELECT.

ADF Dataflow CTE workaround

At the time of writing, it is not possible to write a query using a CTE in the source of a dataflow. However, there are a few options to deal with this limitation:

  • re-write the query using subqueries instead of CTEs
  • use a stored procedure that contains the query and reference the stored proc in the source of the dataflow
  • write the query as a view and reference the view in the source of the dataflow (this is my preferred method and the one I will demo here)

I will use the following query purely as an example to demo this:
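(Any simple CTE will do; the table and column names below are hypothetical.)

WITH OrderCounts AS (
    SELECT CustomerID, COUNT(*) AS OrderCount
    FROM Sales.Orders
    GROUP BY CustomerID
)
SELECT CustomerID, OrderCount
FROM OrderCounts
WHERE OrderCount > 5;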


If I write this query directly inside the source of the dataflow, then when trying to import the schema or preview the data I get the error message: Incorrect syntax near the keyword 'WITH'.

However, if I create a view for this query and reference the view in the dataflow instead, this works and I can preview the data.
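The view simply wraps the same hypothetical query:

CREATE VIEW dbo.vw_CustomerOrderCounts
AS
WITH OrderCounts AS (
    SELECT CustomerID, COUNT(*) AS OrderCount
    FROM Sales.Orders
    GROUP BY CustomerID
)
SELECT CustomerID, OrderCount
FROM OrderCounts
WHERE OrderCount > 5;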

Note that in the source you can also write a query referencing the view (shown below); this is useful if you require additional logic on top of the view.
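For example, again using the hypothetical view name:

SELECT CustomerID, OrderCount
FROM dbo.vw_CustomerOrderCounts
WHERE OrderCount > 10;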

As mentioned earlier, you can also write the query in a stored procedure and reference this in a similar way to the above. You should now be able to use this simple method to work with CTEs in the source of a dataflow.
