“Will that break a query folding?” “Does your query fold?”… Maybe someone asked you those questions, but you were like: “Query…Whaaaat?!”
Or, maybe you’ve heard about query folding in Power BI, but didn’t know how to take advantage of it in real-life scenarios.
If you recognized yourself in (at least) one of the two situations specified above, then please continue reading this short series of blog posts!
In the previous part of the series, we explained what is query folding, which data sources support it, and which transformations within the data source itself. Now, let’s examine why is it important to achieve this behavior – or, maybe it’s better to say – why should you care if the query folds or not?
Why should you care about Query folding?
When you’re using Import mode in Power BI, the data refresh process will work more efficiently when the query folds, both in terms of refresh speed and resource consumption.
If you are working with DirectQuery or Dual storage mode, as you are targeting SQL database directly, all your transformations MUST fold – or your solution will not work.
Finally, query folding is also of key importance for Incremental refresh – it’s so important that Power BI will warn you once it determines that query folding can’t be achieved. It will not break your incremental refresh “per-se”, but without query folding in place, incremental refresh wouldn’t serve its main purpose – to reduce the amount of data that needs to be refreshed in your data model – as without query folding, Mashup engine needs to retrieve all data from the source and then apply subsequent steps to filter the data.
With all these in mind, you should tend to achieve query folding whenever possible.
Slow report – don’t blame Query folding!
One important disclaimer here, and this is one of the key takeaways from this series of blog posts: if your report is slow, or your visuals need a lot of time to render, or your data model size is large, query folding has nothing to do with it!
Only if your data refresh or incremental refresh is slow and inefficient, you should investigate your Power Query steps in more depth.
All or nothing?
Few more things to keep in mind regarding query folding. It’s not all or nothing process. That means if you have, let’s say, 10 transformation steps within Power Query, and your query folds until the 6th step, you will still get some benefit from partial query folding. However, once the query folding is broken, it can’t be achieved anymore.
To simplify, if you have 10 transformation steps, and your query folding is broken in the 5th step, all previous steps will fold, but once the folding is broken, it can’t be achieved again, even if you have transformations that support query folding by default in steps 6 to 10 – like in our example where filtering should be a foldable step, these steps will not fold. Keep that in mind, and try to push all non-foldable steps down the pipeline as much as possible.
How do you know if the query folds?
Ok, now we are not rookies anymore. We know what is query folding, why should we strive to achieve it, and some subtle tricks that can make a huge difference.
Now, it’s time to learn how can we check if the specific query folds or not. The first and most obvious way is to right-click on the step and check how the View Native Query option looks like.
If it’s greyed out, this step (probably) not folds. On the other hand, if you are able to click on this option, that means that your query, again probably, will fold. I guess you are maybe confused with the word: PROBABLY!
But, that’s the proper word, as you can’t be 100% sure that if the View Native Query option is disabled, that your query doesn’t fold. I will show you later how can this option trick us to think that the query folding was broken, even though that in reality folding actually occurs.
Instead, when you want to be sure if your query folds or not, you can use the Query Diagnostics feature within Power Query Editor, or SQL Server Profiler, like a good old and reliable way to check the queries sent to a database by the Power BI engine.
Additionally, there is a cool new feature in Power Query Online, where each step is marked with the icon that shows if that step folds, not folds, or unknown. As I said, this feature is available only in Power Query Online at this moment, so let’s hope that Power BI team will implement it in the Desktop version soon.
Devil is in the detail…
Fine…You’ve probably heard about the saying that the devil is in the detail. Now, it’s time to understand how little nuances can make a big difference in our data transformation process.
Let’s start with one of the most curious cases in Power Query editor…
Devil #1 – Merge Join
This one is very interesting, as you will hardly assume what is happening in the background. Let’s say that I want to combine two of my queries into one. I will use the Adventure Works sample database, and I need to merge FactInternet Sales and DimCustomer tables.
I’ll remove some of the columns from my fact table, and keep only CustomerKey column, as this is a foreign key to a DimCustomer table, and Sales Amount column. I will join DimCustomer table as it is, without any additional steps before merging.
Merging tables is equivalent to JOIN operation in SQL. Essentially, we choose the column on which we want to perform MERGE operation, and the type of join (left, outer, or inner).
The problem is that by default, when you’re merging two queries, Power Query will generate a nested join statement, which can’t be properly translated in SQL.
If I go to the Tools tab and click on Diagnose Step, I can see that the Mashup engine fired two separate queries to my underlying SQL Server database – in other words, those two queries couldn’t be executed as a single SQL statement, and that means that query didn’t fold!
How do we solve this? Let’s just choose a blank query and write our M code by hand, to achieve exactly the same result.
The key thing is that we will use a similar, but still different M function: Table.Join.
All function arguments are exactly the same as previously, and let’s now check the outcome.
You remember once I told you that when the View Native Query is greyed out, your query probably doesn’t fold, but it’s not 100% correct. And, this is a good example. If you take a look at View Native Query, it still shows that our query doesn’t fold…
…but let’s go to Diagnostics and check if that’s true.
Oh, boy, we were tricked – this step indeed folded! As you can see in the illustration above, we have one single SQL query generated and sent to a SQL Server source database to be executed.
So, we found two devils in this example – the first one was join type, which we were able to solve by tweaking automatically generated M code. And, the other one was the incorrect behavior of the View Native Query option. I’ll show you in the next part of the series one more example when View Native Query lies.
As we laid a solid theoretical background about Query folding in the previous part of the series, we’ve deepen our understanding of query folding concept’s importance.
Moreover, we’ve seen how some subtle differences can make a huge impact on the process of translating M code to SQL, and learned how to solve one of the most common challenges when it comes to query (non)folding – and that is a MERGE operation!
In the next part of the series, I’ll show you a few more handy tips related to Query folding.
If you are impatient to learn more about Query folding, I strongly recommend checking “30 days of Query folding challenge”, created by Alex Powers, as this is probably the most comprehensive resource for understanding this concept.
Last Updated on March 9, 2021 by Nikola