0845 643 64 63

Making the DAX Engine Work for You

In my last blog post which can be found here, I demonstrated that it’s important to limit your DAX filters to specific columns rather than entire tables, now we know that we can and should do it, let’s explore why and how it works.

There are numerous ways to write an expression that will give the same result set, but why are some ways more efficient than others? To answer that we need to know a little about what happens under the hood, inside the DAX engine(s).

A lot goes into processing a DAX expression but the focus of this blog post will be the Formula Engine (FE) and the Storage Engine (SE).

Both engines do different jobs, think of the FE as the brain, it provides the instructions, and the SE as the muscles, it does the fetching and carrying.

The FE takes a DAX expression and turns it into a set of sequential instructions, (joins, aggregations, filters, etc). It then sends these instructions to the SE. Due to the single-thread nature of the FE to SE communication, the instructions are sent to the SE sequentially. Therefore, the more instructions sent, the longer the entire process takes.

The SE takes these instructions and executes them against the data source(s), collects what is requested and delivers it back to the FE. Unlike the FE, the SE uses multi-threading so can process the FE’s requests very quickly.

So now that we know a little about the FE and SE and what roles they perform, how can we use it to our advantage?

The SE can do a lot of complex work very quickly, but if we can limit the amount of instructions it has to carry out, the quicker the overall process becomes. This is best shown with an example.

*The queries below are being run in DAX studio so the stats can be recorded alongside the result set

Here is a simple expression to select the total amount of sales, per region, taken from everyone’s favourite imaginary online bike store:

DEFINE MEASURE Sales[Count] =
CALCULATE(
DISTINCTCOUNT('Sales Order'[Sales Order]
)
)
EVALUATE
CALCULATETABLE(SUMMARIZECOLUMNS(Customer[Country-Region], "Sales by Region", Sales[Count]))

As you can see, the FE takes the DAX expression and turns it into 1 single query. The instruction is sent, the SE does the hard work and passes the result back. The entire query took 44ms to return 7 rows, one per region. Nice and fast.

However, if we add a filter to the query and use the entire Sales table (highlighted in red below) in the filter, look what happens:

DEFINE MEASURE Sales[Count] =
CALCULATE(
DISTINCTCOUNT('Sales Order'[Sales Order]),
FILTER(
Sales, Sales[Sales Amount] > 350
&& Sales[OrderDateKey] > 20200101
)
)
EVALUATE
CALCULATETABLE(SUMMARIZECOLUMNS(Customer[Country-Region], "Sales by Region", Sales[Count]))

The expression now takes 8 times longer to complete. The reason for this is the FE sends the SE 8 queries, and remember, it must do these sequentially.

1x The SE fetches the table for the filter (Sales) and trims it down to only the rows that meet the filter criteria:
a. Sales Amount > 250
b. OrderDateKey > 20200101

7x That filtered table, which is now held in memory, is applied across 7 more queries, one query per region in the Customer table.

There are 8 times as many instructions sent to the SE and the whole process takes around 8 times as long to complete.

With a little knowhow we can rewrite the expression in a more efficient way, using only the specific columns in the filter (again, in red). This enables the FE to parse it into a single instruction and ping it off to the SE.

DEFINE MEASURE Sales[Count] =
CALCULATE(
DISTINCTCOUNT('Sales Order'[Sales Order]),
FILTER(
ALL(Sales[Sales Amount]), Sales[Sales Amount] > 350),
FILTER(
ALL(Sales[OrderDateKey]), Sales[OrderDateKey] > 20200101)
)
EVALUATE
CALCULATETABLE(SUMMARIZECOLUMNS(Customer[Country-Region], "Sales by Region", Sales[Count]))

The SE receives 1 instruction, produces the same result set and the entire process takes a fraction of the time.

Although the technology behind DAX is very quick, learning the most efficient ways to write expressions can give that technology a helping hand and make your expressions, and therefore your Power BI reports, just that bit quicker.

If you need any help or advice with your DAX expressions or Power BI reports in general, feel free to leave a comment or drop me or any of the team an email with your queries, we would love to hear from you.

Variables in DAX

Variables can simplify your DAX code, help with debugging and help improve performance. To use variables in your DAX measure you need to declare a variable using the VAR […] Continue Reading…

1 2 3 15
Power BI Sentinel

The Frog Blog

Team Purple Frog specialise in designing and implementing Microsoft Data Analytics solutions, including Data Warehouses, Cubes, SQL Server, SSIS, ADF, SSAS, Power BI, MDX, DAX, Machine Learning and more.

This is a collection of thoughts, ramblings and ideas that we think would be useful to share.

Authors:

Alex Whittles
(MVP)
Reiss McSporran
Jeet Kainth
Jon Fletcher
Nick Edwards
Joe Billingham
Microsoft Gold Partner

Data Platform MVP

Power BI Sentinel
Frog Blog Out
twitter
rssicon