Love it.
Can you make a blog post on how to calculate the mode (most frequent value) at different levels of granularity? Such as by order, by product, by category, or for the whole dataset, assuming the order quantity is a whole number. Each order can have multiple lines for the same product or for different products.
Thanks for sharing.
I have one question: I can see you have many ID columns in your tables. What if you just deleted those unused ID columns to reduce the model size?
How do you proceed if you want to calculate a year-to-date?
Great point! For this demo, I’ve removed multiple dimension tables from the model, but forgot to remove the foreign keys (ID columns) from the fact table. As you rightly said, that would reduce the model size even more.
Regarding the YTD calculation – that should not be a problem, as long as you are using a measure that hits the imported or aggregated tables.
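For illustration, a standard YTD measure over the imported/aggregated data could look like the sketch below – the [Sales Amt] base measure and the 'Date' table are assumptions, not something from the original post:

Sales Amt YTD =
CALCULATE (
    [Sales Amt],               -- assumed base measure hitting the imported/aggregated table
    DATESYTD ( 'Date'[Date] )  -- assumes a proper calendar table marked as a date table
)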
[…] over a specific visual, the respective action should be displayed! I’ve already written how to enhance your report using tooltip pages, and here we will follow a similar […]
Wow, I didn't know this. However, in data science it's always advisable to clean up tables, replacing blanks with 0s, because that can enable more complex algorithmic calculations later on. So it depends on how you use the data.
In an ideal reporting solution, data marts for the specific period should be in place. To be honest, performance-wise a table will definitely perform better if you can load the whole data rather than joining, but if it's a scheduled refresh of data, we can always refresh at an off-peak time for the report.
Great post, Nikola! I am playing with this new visual now to build a case for one of my customers and everything looks perfect until I publish the report to the service. Then it says that the report couldn’t access the data source and it doesn’t render. Have you stumbled upon such issue? I am thinking it might be because the source for my paginated report is an SSAS tabular database but then again, it doesn’t make sense to work in PBI Desktop and fail in the service.
Hello Nikola, this is amazing! I have a couple of charts that will greatly benefit from this. However, when I try to apply the logic to a data point that is not a measure, it doesn’t work. Is there something I can do to convert the data to a measure or a different dax code?
As you said, views are a stored query over one or more tables that executes when the view is called. OK, so what happens when a view is joined to another table to get a result set? Will the stored query need to execute for each record joined with the table?
Is this a best practice? Or what is the best practice in this situation?
SELECT *
FROM tableA a
JOIN viewA b ON a.id = b.id
Will viewA's stored query execute for every joined record from tableA?
[…] 3 reasons to use views instead of tables in Power BI!In recent months, I was frequently asked if one should use tables or views when IMPORTING data in Power BI. I’ve intentionally marked word importing, as the main purpose of this article is to compare usage of views vs tables in Import mode. […]
If a dataset has parameters created using the Manage Parameters option in the Power Query Editor, how can we add or utilize those parameters in our native query?
Can partitions come from different tables?
Eg your import partition is one table and the DQ one is a different table (but they both have near-identical structures)?
If you then set up your ETL so your DQ-table is always the differential vs the import table, the scan to the DQ-table is not that costly…
Just don’t know if that is technically possible….
Hi Nikola, happy you wrote this article!! Trying to understand. So:
– scenario 1 is where the DirectQuery data is recent/hot and the imported data is cold
– scenario 2 is where the imported, aggregated data is recent/hot and the DirectQuery data is cold?
What if you have a composite model, that is only refreshed at the beginning of day (so no real-time data) whereby the requirement is that I want to have insight in aggregated (import) and very recent detailed (direct query) data ?
Is it then better just to stick to the composite model?
If you’re using a composite model where tables are with different granularity (I’m assuming the import data is aggregated per year/month/product/whatever…and DQ table goes to the lowest level of granularity), then I’d say you should stick with composite model. Because, in the end, a Hybrid table is just a table: so, you can’t define different granularities within one single table.
Nice piece.
I find this a really interesting question and have come to the opposite conclusion – go with 2 actors as the default. It seems this puts me out of step with most experts on the topic but my reasoning is all about ease of use for self-service. I find that a big benefit of 2 actors is you can name the columns: Bet Placed Month, Bet Processed Year, etc etc. Much clearer for the end user to build a visual, and much clearer for the report consumer to have a visual with Title and axis labelled unambiguously.
To me the extra training of end users comes with the 1 actor approach. Having to explain that if you want to slice by date you have to pick the right measure, but you can choose any measure if you want to slice by product is what confuses. Also, ending up with a visual that, for example, counts bets by product that then mentions (Date Processed) in the Title when the date is totally irrelevant leads to confusion too.
No issues with the “2 actors” approach at all! It just depends on a specific use case, and evaluating all pros and cons that come with any of these two solutions in that specific case.
Great Article, thanks for clarifying how 2nd Scenario should be configured in practice, as I immediately imagined this version while learning about Hybrid table preview. I only have one question – Tabular Editor was raising an error until I set an import partition Data View to Sample mode. Do you know why it was needed and if there are any consequences related?
Doesn't matter – I just found that the problem was caused by an incorrect Compatibility Mode. When I switched from 1500 to 1565, it all works as expected, so I am just leaving a hint for the others. Great feature!
Thanks for posting this, Nikola! The requirement of slicers reacting to other slicers is not new and the common go-to solution for new Power BI developers is to accomplish it with bi-directional relationships. Somehow, the slicer filtering by non-blank measure results isn’t yet widely popular, so thank you for bringing it to everyone’s attention.
Hi,
we are trying to apply scenario #2 to our Power BI data model. I would like to ask what the Data View setting should be for each partition, because we created 2 partitions – Import with Data View Full and DirectQuery with Data View Sample (as Full or Default is not possible to choose). After finishing the full process of the appropriate table in SSMS, only data from DirectQuery is displayed in the report/Excel. Can it be caused by the partition being set to the Sample data view? Or do you have any idea where the problem could be? Thanks.
Hi Nikola,
Question: I tried Page Refresh with DirectQuery to an SSAS cube on-prem, but this doesn't give me the desired result. I think this is not yet supported, and what you show works only for DirectQuery to a database. Is this correct?
To test this, I created a PBI report, connected it to an SSAS cube, set the page refresh to 1 second, uploaded it to the PBI cloud, opened DAX Studio and turned the profiler on. If I open the report, this registers my queries, but I can't see the queries from the page refresh; and if I update the cube in the background, there is no update in my PBI report.
Thanks for your comment. You can also use Reference (I guess you thought Reference when you said link), but my preferred choice is almost always Duplicate. Let me explain why:
Duplicate creates a brand new copy with all the existing steps and a new copy is completely independent of the original query. You can freely make changes in both the original and new query, and they will NOT affect each other. On the other side, Reference is a new copy with one single step: retrieving the data from the original query. If you change the original query, let’s say, remove the column, the new query will not have it anymore.
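For context, a referenced query in Power Query is literally just a new query whose single step points at the original – a minimal M sketch (the query name "Original Query" is hypothetical):

let
    // The only step: point at the original query. Any change made to
    // "Original Query" automatically flows through to this query.
    Source = #"Original Query"
in
    Source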
If you use a Reference, then there are problems with the preview.
When the first table is the main table, you cannot see the columns of the referenced query and therefore cannot select them for the join. You can only do the opposite, but that makes no sense.
I hope you understand – I'm not a native English speaker)
[…] optimal condition — which means, reduce the data model size whenever possible! This will enable VertiPaq’s Storage Engine to work in a more efficient way when retrieving the data for your […]
This is excellent. Until now I was under the assumption that VertiPaq plays a role in both Import and DirectQuery, but with this I got clarity on row-mode storage and columnar storage too. Thanks for the best one.
Hi Nikola,
This is a wonderful article. I am currently using SSAS Import mode to get 10 million rows of data and it's super slow. Do you have any suggestions or blogs on the site to refer to in order to deal with long-loading queries? For now, I have split the data into multiple queries using SQL LAG, and then appended all the tables as a new query.
It’s hard to answer without knowing more detail. Generally speaking, if you’re facing issues with slow data refresh process, I’d strongly recommend checking Phil Seamark’s blog about visualizing the data refresh process. Then, you can identify which part (or which query) of the process is the bottleneck and check if something can be done to optimize it.
Hi Nikola,
Great article, thank you. One comment, I see you don’t include the publishing date of your blog post, do you see that better than including it?
Regards,
Fernando
Thanks for your feedback! Actually, the original published date is included on the main page, while I keep the “Updated date” within the article itself (at the bottom), as it sometimes happens that I apply modifications to the article.
Hi Nikola, this article (and all the others in the DP-500 series) is really helpful. Just one question if that's OK: why is the [factonlinesales] table in DirectQuery mode? Can you end up in a scenario where the data in the [factonlinesales] table and [Sales Product Agg] don't balance, as the DirectQuery tables have the latest data and the import tables might not have been refreshed? Thanks, Phil
Thanks! For User-defined aggregations in Power BI to work, the detailed table (which is FactOnlineSales in our example) MUST be in DirectQuery storage mode. If you want a “one table fits all” solution, then I’d recommend taking a look at the Hybrid tables feature, which combines DirectQuery and Import within one single table! I wrote about this feature here: https://data-mozart.com/hybrid-tables-in-power-bi-the-ultimate-guide/
Data Refresh
– Query Dependencies
Level of dependencies. I have seen queries with more than 4 levels of depth cause long refresh times.
– Size.
Most of the time we don't need all the data, so we should bring in only what's needed.
Data Model
– Data types
– Role Playing Date Tables or other Dimensions.
Note: Date tables get created internally (and linked to) if we don't provide our own date tables.
– Relationships.
Some unwanted relationships can be deleted.
Some may need relationships
– Data formats and default summarize by
– Hide columns that are not needed (Hide in report view)
DAX
– Check the DAX generated using Performance Analyzer. (You placed a Stop/Red icon – not sure whether you meant to check the queries or to ignore them.)
– Sync slicers
– Bookmarks
Visuals
– Pay attention to the number of elements displayed or retrieved
External Tools
– ALM Toolkit
– Model Documenter
– DAX Studio (You placed a Stop/Red icon – not sure whether you meant to check the queries or to ignore them.)
– Tabular Editor
There is a lot more to consider. I listed the items within the focus of your diagram, and the diagram would be enhanced by adding them.
[…] You can find more info about Power BI field parameters in the official blog post, but also here, here and here. The use case I’m talking about is slowly changing dimensions of Type 2, you know, the […]
I have mixed feelings about allowing personalization. I don’t necessarily want to have to answer questions about changes that someone else has made. What are your thoughts about supporting personalizations?
Thanks for your comment, Jessica. Generally, I like them and I think it’s an undervalued feature in Power BI. However, I understand your concerns. In the end, I guess it’s all about educating report consumers on how to use certain features and functionalities:)
[…] there were many “game changers” — features that brought revolution instead of evolution! Composite models, Dataflows, AI-powered visuals, Hybrid tables, Bookmarks…You name […]
Hi Nikola,
This is wonderful, thanks for sharing. Can I ask: are the dates in the 'Date UTC Agg' table unique? How do I make this table the 'Many' side of a relationship to the 'One' side on the DimDate table?
My source SQL query contains a transaction isolation level modifier, READ UNCOMMITTED. With it, the query does not fold; without it, it does fold. Do you have an idea why this is the case? I can find some sense in presuming that folding lifts more load off the server than the transaction isolation level modifier does, but still.
Hi
I have one query.
I created two partitions on my table – a hot data partition and a cold data partition – with Import mode and DirectQuery mode respectively. But when I filter the records from the hot data partition, it still hits DirectQuery. Does it mean that Import mode is not working at all in this scenario?
Thanks,
Ani
Even though you’re retrieving the data from the import mode table, the DirectQuery query will still be generated (even though it should return an empty result set). You can check the exact query in SQL Server Profiler.
[…] data, reduced the cardinality wherever was feasible, adjusted column data types, and applied some additional recommended practices…And, then, I remembered Benni’s tip and decided to give it a shot and see what happens if I […]
1. Why would a denormalized model be more efficient in terms of storage? I would expect the opposite, as I have redundant data.
2. Why do fewer joins reduce the memory footprint?
1. In the Tabular model, VertiPaq stores compressed data. So, if you don’t have many distinct values in the column (let’s say a column with 50.000 records and 20 distinct values), this column will be highly-compressed and therefore potentially consume way less memory than the relationship between two separate tables (relationships also have memory footprint). Of course, that’s not always the case, but potentially could be.
2. Please see my answer on 1. Relationships also consume memory in the data model, so getting rid of joins means you are also getting rid of these relationships.
[…] two key components of the underlying architecture — Formula engine and Storage engine — in this article, and I strongly encourage you to read it before proceeding further, as it will help you to get a […]
Hi Nikola,
Thank you for your work, and all your articles. They are very clear and helpful.
My question is close to the previous one.
We always discuss aggregations with both Import and DQ tables. Does it make any sense to do it with only Import tables, in order to reduce the granularity scanned for most measures while still being able to deep dive, and to avoid multiple switch measures checking which table should be queried?
Hi Olivier,
That's a fair question. However, user-defined aggregations in Power BI work if and only if the main table (fact table) is in DQ mode. Hope this helps.
Cheers,
Nikola
Hi Nikola, I have an issue with the Power Apps visual refreshing when I navigate between different report pages. Do you think this will potentially fix that issue? I want the Power Apps visual not to start the app over and over again…
The content of the article remains too thin compared to an ambitious title like “Everything you need to know”
Would it be possible to update the article by including the following points:
– Scenario where a workspace is connected to an ADLS Gen2
– Consumption of dataflows, especially in a datamart
– Incremental update
– Access permissions, in particular permission to modify a dataflow
Thanks for your comment.
These are all valid points, but it would probably be a chapter in the book if I covered them all in one article:)…I’ll cover some of these topics in the next articles.
[…] in a role-playing dimension scenario. Unlike in one of the previous articles, where I explained how to handle role-playing dimensions using the USERELATIONSHIP() function to change the active relationship between the tables, I’ll […]
Hi Nikola, thanks for all your interesting and great info, support, documents, trainings, etc
I find them very inspiring.
Have a great Christmas time and New Year!
Suzan van Echtelt from the Netherlands
I created a new dataflow using SQL Server – do I need to refresh the dataset every time after the dataflow refreshes, or will the data update in the dataset automatically?
In this case, are both dataflow and dataset scheduled refreshes needed, even though my dataset is only connected to the dataflow?
[…] functions extensively for various analytic tasks. I’ve described one of the possible use cases in this article, but there are literally dozen of scenarios that can be quickly and intuitively solved by using […]
It's not a good practice to use views over tables if you are using the data as-is – regardless of whether you are using Import or DirectQuery. As you mentioned in the article, views are nothing but a stored query which is run every time the data is called. This means database resources are used each time – which can cause substantial slowdown and cost increase that could totally have been avoided by using tables directly. I read your point on using views if the number of columns / amount of data is significantly less than what's on the parent table – I agree with it, but that should be a very specific scenario. It's almost always a better practice to use tables over views if you are using the data as-is.
Hi Dev,
Thanks for your comment.
I talked about using the data from the SINGLE table “as-is” (no joins, calculations, etc.). So, only omit unnecessary columns and/or change column names.
Let's say that my view A returns 10 out of 15 columns from table B (5 columns are not necessary for reporting purposes). Why do you think that something like SELECT col1,col2,col3…col10 FROM view A will be less efficient than running SELECT col1,col2,col3…col10 FROM table B? There's no difference in terms of performance, costs, etc.
Also, if you import the data to Power BI, why import table B with 15 columns and then remove 5 of them using Power Query, if you can clean up stuff in the database and import view A with only 10 necessary columns? Views, same as tables, support query folding, so no difference from that perspective as well.
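To make that concrete, such a view is nothing more than a stored projection – a minimal T-SQL sketch (object and column names are hypothetical, following the "view A over table B" wording above):

-- The view only projects the 10 reporting columns, so selecting from it
-- produces the same plan and cost as selecting those columns from the table.
CREATE VIEW dbo.viewA
AS
SELECT col1, col2, col3, col4, col5,
       col6, col7, col8, col9, col10
FROM dbo.tableB;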
Thank you for the post!
However, some formulas do not work for me. Below is my calculation of the AVG days between orders. I have used your formula, but the error I am getting is: “The column [NUmber of days btw Orders] specified in the SUMMARIZE function was not found in the input table”. Because it’s a measure, it’s not a column, I guess. But how did it work in your formula?
AVG Days btw Orders =
CALCULATE (
    AVERAGEX (
        SUMMARIZE (
            ALLSELECTED ( Orders ),
            Orders[Customer Name],
            Orders[Order Date],
            [NUmber of days btw Orders]
        ),
        WINDOW (
            1, ABS, -1, ABS,
            SUMMARIZE ( ALLSELECTED ( Orders ), Orders[Customer Name], Orders[Order Date] ),
            ORDERBY ( Orders[Order Date] ),
            KEEP,
            PARTITIONBY ( Orders[Customer Name] )
        )
    )
)
Hi Marina,
Thanks for your comment.
Please check the version of the Power BI Desktop you are using. These functions are supported from the December 2022 version of PBI Desktop.
Hope this helps.
Hello, I’ve noticed a lot of confusion regarding the location of the data on a dedicated SQL pool. I’ve read in some places that the data is kept inside the database of the dedicated SQL pool, but as you mentioned here and it states in the Microsoft documentation, the tables’ data are kept on a data storage outside of the dedicated pool. What are your thoughts on this?
Thanks for the great article!!
Can you extend your comparison between Power Bi data flows and ADF wrangling data flow to consider Azure Synapse Data Flow?
I’m just surprised with the ADF limitations and curious to understand if Synapse is also very limited, which will make PBI dataflow the best option for most use cases.
Hi Nikola, thanks for this content. In my scenario I need to let the user pick "Select All"; in this case, DAX functions like ISFILTERED or HASONEFILTER return false instead of true. Any way to overcome this? Thanks, Doulglas
Hi Nikola,
incredible article and extremely helpful 🙂
I have a question. I have a Power BI report running on a SQL dedicated pool. At the moment we only have columnstore indexes. On one side, we have some dashboards and KPIs where the columnstore indexes are performing quite well. But on the other side, we have matrices with drill-down capabilities, where we notice the compression applied by the columnstore index is not performing well.
As an improvement, I want to create materialized views in order to group the columns used in the matrices.
The question is: should I remove the columnstore index and add a "normal" clustered index on the table, in order to get better performance with the materialized views, and let the engine decide where to get the data from?
The biggest fact table has currently 2.2 billion records.
BR.
Paul
Paritosh
You don't need a disconnected table. A measure changing the filter context would be sufficient.
Something like the SAMEPERIODLASTYEAR approach.
Nikola
Thanks for your comment Paritosh. As for almost any problem, there are multiple different solutions, and disconnected table is just one of them:)
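For illustration, the filter-context approach Paritosh mentions could look like the sketch below – the [Sales Amt] base measure and the 'Date' table are assumptions for the example:

Sales Amt PY =
CALCULATE (
    [Sales Amt],                         -- assumed base measure
    SAMEPERIODLASTYEAR ( 'Date'[Date] )  -- shifts the current filter context back one year
)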
Dennes
Very Good, but also scary!
This is the kind of thing which can change at any minute, in a service pack or something like that, and many would not notice.
Nikola
Thanks for your comment Dennes. I agree with you, we should be careful and try to stay tuned for all adjustments.
Guy Glantser
Thanks Nikola!
I disagree with you. I think that DATETIME2 still has some important benefits, just like you mentioned – less storage, bigger precision, larger date range and being SQL standard compliant. I think that the only meaningful drawback of DATETIME2 is the bug in the cardinality estimator when associated with the DATEADD function. This is a big deal, but it can be avoided by using a DATETIME value in the relevant expression inside the DATEADD function, and then converting back to DATETIME2, just like Paul White suggested.
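For illustration, the workaround Guy describes could look like the T-SQL sketch below – the table, the column and the 30-day window are hypothetical, used only to show the shape of the expression:

-- Do the DATEADD arithmetic on a DATETIME value, then convert the result
-- back to DATETIME2 before comparing it against the DATETIME2 column,
-- so the cardinality estimator produces a reasonable estimate.
DECLARE @Now datetime = GETDATE();

SELECT COUNT(*)
FROM dbo.Orders
WHERE OrderDateTime2 >= CONVERT(datetime2(7), DATEADD(DAY, -30, @Now));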
Nikola
Hi Guy,
Thanks for your comment.
I agree with your disagreement:)…I was also wholeheartedly in favor of DATETIME2 in the beginning, exactly because of all the advantages you mentioned, but in the end I'm not sure which type is "better". Hmmm, as for the larger date range and being SQL standard compliant, I don't consider these such significant advantages, since I really don't see many scenarios where they can make a big impact (of course, maybe I'm just assuming wrong). But for higher precision and lower memory consumption – I fully agree with you, those are really important advantages, which one should seriously consider when making design decisions.
I guess, as in 99.9% of situations when it comes to SQL Server, the correct answer is: “It depends:)”
Thanks again for your opinion, really appreciate that.
Gudmundur Asmundsson
This is a far better DimDate function that I have been using since 2017. It creates a custom function with a from and to date, plus you can specify the local language for day and month names.
let CreateDateTable = (StartDate as date, EndDate as date, optional Culture as nullable text) as table =>
let
DayCount = Duration.Days(Duration.From(EndDate - StartDate)),
Source = List.Dates(StartDate,DayCount,#duration(1,0,0,0)),
TableFromList = Table.FromList(Source, Splitter.SplitByNothing()),
ChangedType = Table.TransformColumnTypes(TableFromList,{{"Column1", type date}}),
RenamedColumns = Table.RenameColumns(ChangedType,{{"Column1", "Date"}}),
InsertYear = Table.AddColumn(RenamedColumns, "Year", each Date.Year([Date])),
InsertQuarter = Table.AddColumn(InsertYear, "QuarterOfYear", each Date.QuarterOfYear([Date])),
InsertMonth = Table.AddColumn(InsertQuarter, "MonthOfYear", each Date.Month([Date])),
InsertDay = Table.AddColumn(InsertMonth, "DayOfMonth", each Date.Day([Date])),
InsertDayInt = Table.AddColumn(InsertDay, "DateInt", each [Year] * 10000 + [MonthOfYear] * 100 + [DayOfMonth]),
InsertMonthName = Table.AddColumn(InsertDayInt, "MonthName", each Date.ToText([Date], "MMMM", Culture), type text),
InsertCalendarMonth = Table.AddColumn(InsertMonthName, "MonthInCalendar", each (try(Text.Range([MonthName],0,3)) otherwise [MonthName]) & " " & Number.ToText([Year])),
InsertCalendarQtr = Table.AddColumn(InsertCalendarMonth, "QuarterInCalendar", each "Q" & Number.ToText([QuarterOfYear]) & " " & Number.ToText([Year])),
InsertDayWeek = Table.AddColumn(InsertCalendarQtr, "DayInWeek", each Date.DayOfWeek([Date])),
InsertDayName = Table.AddColumn(InsertDayWeek, "DayOfWeekName", each Date.ToText([Date], "dddd", Culture), type text),
InsertWeekEnding = Table.AddColumn(InsertDayName, "WeekEnding", each Date.EndOfWeek([Date]), type date)
in
InsertWeekEnding
in
CreateDateTable
Nikola
Fantastic! Thanks for sharing Gudmundur!
hubert
About date … Could you please display publish date? It’s crucial for tech blogs!
Nikola
Sure, thanks for the suggestion.
Eitan Blumin
I think the bad statistics estimations should pretty much be a deal-breaker here…
Fred Kaffenberger
what about CDM, Python, R?
Nikola
Hi Fred,
Personally, I’ve had maybe one question with R, no Python/CDM at all. Which doesn’t imply that it can’t happen that someone else had it.
Fred Kaffenberger
thanks! I’m reviewing the official training now, so I don’t expect too many difficulties
Carmine
I'm curious then… what is a productive way to store/display dates in Power BI? Can you give a hint on that?
Nikola
Hi Carmine,
If you don't need the time part, then store it as a Date only, since there will be fewer distinct values; therefore data compression will be better and your data model size will be lower.
In any case, don't forget to change the data type in the Power Query Editor, BEFORE you load the data into your report. Once you load it to a report, you can still change the formatting to exclude the time part, but the underlying data model will keep the time part.
Hope that this helps.
Vivek Ranjan
Great article. Thanks!
I have a question: How can we get the column header name instead?
Nikola
You’re welcome. Sorry, but I’m not sure that I understood your question: column header name instead of what?
Sérgio. Silva
Great article. Kind regards from Portugal.
Nikola
Thanks Sergio, much appreciated! Greetings to you from Austria:)
Ray Ludeña
Thanks Sergio.
This article is very interesting.
I have a question: could these recommendations also be used in other visualization tools like Qlik?
Regards from Perú
Nikola
Hi Ray,
Some of these rules are generally applicable (like, for example, eliminating columns/rows you don’t need), but I really don’t have any experience with Qlik, so I don’t know how it stores or compresses the data. The example from my article relates to the Tabular model, which runs in the background of SSAS Tabular, Power BI, and Power Pivot for Excel.
Baz
You could always group the slicers with two small shapes at either end of your panel to create the width and height you want. Go to the Selection pane and hide the shapes so they are not visible to the end user. You can then add a background to the group. No bookmark needed.
Nikola
That’s also an awesome trick! Thanks for sharing Baz.
Wederley
Very interesting and useful article! Thanks for sharing!
Greetings from Brazil.
Nikola
Thanks Wederley, I’m glad that you find it useful! Greetings from Austria.
Srecko
A good and useful article, Nikola!
Nikola
Thanks, Srecko
John K
Nice tip. Now to push for more.
How can you have CONCATENATEX substitute "and" for the last comma? Instead of getting "Apples, Bananas, Pears", you get "Apples, Bananas and Pears".
Consultant Power BI
You have to use SUBSTITUTE with its last parameter, [InstanceNumber], equal to the DISTINCTCOUNT of the given column minus 1 (which means the last position in the series). This is an example:
Selected Brands = SUBSTITUTE( [YOUR_CONCATENATE_FORMULA], ",", " and", DISTINCTCOUNT('Product'[BrandName]) - 1 )
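To show how the two parts fit together, a complete measure might look like the sketch below – the 'Product'[BrandName] column comes from the example above, while the ", " delimiter and the measure name are assumptions:

Selected Brands =
VAR BrandList =
    CONCATENATEX (
        VALUES ( 'Product'[BrandName] ),   -- one row per selected brand
        'Product'[BrandName],
        ", ",                              -- assumed delimiter
        'Product'[BrandName], ASC
    )
RETURN
    IF (
        DISTINCTCOUNT ( 'Product'[BrandName] ) > 1,
        -- replace only the last ", " with " and "
        SUBSTITUTE ( BrandList, ", ", " and ", DISTINCTCOUNT ( 'Product'[BrandName] ) - 1 ),
        BrandList
    )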
mobile legends hacks
You really make it seem easy with your presentation, however I find this topic to be really
something which I believe I might never understand. It seems too complicated and very huge for me.
I am looking forward to your next post, and I'll attempt
to get the grasp of it!
Nikola
Thanks! It's really not that hard – just try to understand general data modeling principles (the Kimball methodology). After that, I'm sure you will get to grips with the specific related topics.
mobile legends cheat
Hmm it seems like your blog ate my first comment (it was
extremely long) so I guess I’ll just sum it up what I wrote and say, I’m thoroughly enjoying your blog.
I too am an aspiring blog writer but I’m still new to everything.
Do you have any helpful hints for newbie blog writers?
I’d really appreciate it.
mobile legends hack
This is very fascinating, you're a very skilled blogger. I have joined your RSS feed and
look forward to more of your excellent posts.
Additionally, I have shared your site in my social networks
mobile legends hack
Pretty! This was a really wonderful article. Thanks for providing this info.
Bharath
Nice article. I agree with the approach you mentioned here. What happens when a user says they want the same conditional formatting in the exported Excel as well?
Nikola
Good question! Unfortunately, at this point in time, conditional formatting doesn’t work when exporting data from the Power BI report to Excel.
Ben Connelly
Yeah, unless MS finds a brilliant way to unify how conditional formatting works and is written in both pieces of software, it seems like that may never happen unfortunately. Even the Export to Excel feature is only squeezing out the data only… Excel wouldn’t be able to read how PBI encodes its calculation engine.
Don’t get me wrong, Excel has through Power Pivot the wherewithal to understand all that, but it doesn’t appear MS is interested in wiring all these similar systems together. I hope I’m wrong. I hope they come out with Excel 2022 and say: “Hey, we’re merging Excel and Power BI’s differing functionalities into one… Power Excel BI!!!”
Nikola
Completely agree with you, Ben. Let’s hope that MS hears our prayers before 2022:)!
By the way, Power Excel BI – sounds awesome:)!
Lee
I've been using bookmarks as navigational tools on paginated reports but never considered using bookmarks in place of static tooltips. This is definitely going to raise my visual game, thank you!
Nikola
You’re welcome, I’m glad if you find it helpful!
Carlos Santos
Very nice and simple way to improve the report performance!
Nikola
Thanks Carlos, I’m glad if you find it useful
Ferdinand Santos
Hi Nikola,
Thank you! I will be using this approach in my upcoming project.
Comment on the following;
“Before we conclude, one important remark: in order for this trick to work in Power BI Service, you need to switch the function in My Tasks measure, and use USERPRINCIPALNAME() instead of USERNAME():”
Shouldn’t it say “use User Service instead of User Local”?
Best regards,
Ferdinand
Nikola
Hey Ferdinand,
Thanks for spotting this, you are completely right! Just updated in the article:)
Thanks again!
Nikola
Rhonda
I love this! Great example of a real world scenario, and gives me ideas to get away from the boring tables!
Nikola
Thanks Rhonda, I’m glad if you find the article useful!
Roland Szirmai
Hi Nikola,
I was just exploring this option last week, and I’m so happy that GITC pointed me here! Really great example – and more importantly I can copy-paste this 🙂 “I’m not lazy, just efficient”, right?
One thing, which in my case helps a bit more is the below (sorry for the formatting in advance):
My Customers =
VAR user = USERNAME()
VAR uName = LOOKUPVALUE('User Directory'[Surname], 'User Directory'[E-mail], user)
RETURN
    CALCULATE(
        COUNTROWS(Customers),
        //KEEPFILTERS(
        CONTAINSSTRING( Customers[Sales Rep Search], uName)
    )
The reason behind that is that we can have multiple SalesReps linked to a single Customer. Of course, this makes it a bit more challenging to apply page-level filters (or maybe I'm doing something wrong), but I can apply this filter on each visual on the page.
With the bookmarks and buttons of course!
Again, great stuff, and thanks!
Regards,
Roland
Nikola
Hey Roland,
Thanks for expanding on this, really great use-case, I like it! And, yes, “we are not lazy, just efficient”😁😎
Best,
Nikola
Steve
Great info. Looking forward to your other Power BI 101 topics/posts.
Nikola
Thanks Steve! The next article is already published: https://data-mozart.com/power-bi-data-shaping-in-a-nutshell/
Stay tuned for the rest of the series:)
Weekly Data Round-up 01
[…] Slicing Data with Images in Power BI – Nikola Ilic (DataMozart) has blogged about how to use images for slicing data in Power BI reports here. […]
Kay
Great summary! Love this. If I’ll get asked what the differences are, I’ll point them to this post.
Nikola
Thanks Kay!
camo phone case
Your way of describing everything in this paragraph is truly good; everybody will be able to
understand it without difficulty. Thanks a lot.
Nikola
Glad to hear that!
autenticos y exclusivos
Great article post. Thanks again. Will read on…
Nikola
You’re welcome!
we just did shirt
These are actually enormous ideas on the topic of blogging.
You have touched on some nice factors here. Anyway, keep up the writing.
Nikola
Thank you, I’m glad if you find these questions inspiring
Triparna Ray
Thanks for sharing useful insights on this important topic. With regard to GUIDs: if we do not have the option to change the source structure, we have to bring the GUID in, and to improve performance we tried to add an index column and join based on that. What we found was that this caused query folding to stop working. Any thoughts?
Nikola
Hi Triparna,
As far as I know, there is no proper way to optimize GUID performance once you import it into Power BI. Query folding will not work in cases where your Power Query transformations can't be "translated" to a source language (mostly SQL).
Sohail Imtiaz
Nice article. Thanks for sharing this.
Nikola
You’re welcome, glad that you liked it!
Kishore M
good
Nikola
Thank you!
Bharat
Wow, amazing article – a must-know for Power BI developers.
Nikola
Thank you Bharat!
Anas Farah
Love this series of posts!
I actually have a quick question / potential article topic. I was working on a report this week and I wanted to create a 24-hour moving average measure, and I realized a lot of the time intelligence functions seem to only go down to the hour granularity. So I was wondering: how do you handle hourly data in Power BI? I didn't find a lot of resources online on how to tackle that.
Thanks!
Anas
Nikola
Hi Anas,
Good question! I believe you wanted to say that a lot of the time intelligence functions go down to the DAY level of granularity, rather than HOUR:)…I think you can't use built-in time intelligence functions for hourly calculations, but all of these built-in functions can be rewritten in "normal" DAX. Honestly, I don't know off the top of my head how to handle this request, but you are right – it's a good idea for one of the next articles:)
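For what it's worth, one possible direction in plain DAX could look like the sketch below – just a sketch under assumptions: the 'Sales' table and its [EventDateTime] and [Amount] columns are hypothetical, and the window is anchored to the latest timestamp in the current filter context:

24H Moving AVG =
VAR CurrentTime = MAX ( 'Sales'[EventDateTime] )
VAR WindowStart = CurrentTime - 1            -- subtracting 1 moves a datetime back one whole day
RETURN
    AVERAGEX (
        FILTER (
            ALL ( 'Sales' ),
            'Sales'[EventDateTime] > WindowStart
                && 'Sales'[EventDateTime] <= CurrentTime
        ),
        'Sales'[Amount]                      -- hypothetical numeric column to average
    )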
Hitchhikers guide to Azure Purview for Power BI professionals - vcunews
[…] should have noticed that Microsoft made two important announcements a few weeks ago: the first, Azure Synapse Analytics became GA, while the other also caused a lot of interest: by introducing Azure Purview as the […]
Small Multiples in Power BI – Curated SQL
[…] report performance even if they aren’t displaying any data; several other people have written similar posts too. Small multiples are just another way you can replace several visuals with a single visual […]
Daniel Clark
Have you tested the performance on this?
Nikola
Hi Daniel,
Honestly, I haven’t tested performance, because my table was really small (<1000 records).
Peter Hui
This is still pretty amazing, could have saved me so much time !
Nikola
Thanks Peter, I’m glad to hear that you find it useful!
Bob Blackburn
Thanks for a ton of great content. Most people don’t realize how long technical posts take.
Nikola
You are welcome, Bob!
Chandru Sugunan
Hi Nikola, Correction to the third screenshot from the top(Configuring Power BI tenant). The red box is pointing to incorrect feature switch.
The correct one is documented in our official Purview documentation page. https://docs.microsoft.com/en-us/azure/purview/register-scan-power-bi-tenant#associate-the-security-group-with-the-tenant
Nikola
Thanks for spotting it Chandru! It’s been corrected:)
Ayobami Adeleke
Nice one, please can you leave a link to the pbix here?
Nikola
Hi Ayobami,
Unfortunately, I can’t, as it contains real data that is not for sharing outside of the company. Sorry about that!
Power BI: Data Model Optimization – Data on Wheels – Steve & Kristyna Hughes
[…] External Resources:https://data-mozart.com/inside-vertipaq-compress-for-success/https://github.com/AnytsirkGTZ/TimeTable_MCode/blob/main/MQuery%20Time […]
Power BI: Data Model Optimization from Blog Posts – SQLServerCentral - The web development company
[…] External Resources:https://data-mozart.com/inside-vertipaq-compress-for-success/https://github.com/AnytsirkGTZ/TimeTable_MCode/blob/main/MQuery%20Time%5B1%5D%5B2%5D […]
NEil Rodrigues
Simple but super useful – thank you, this is a great blog. The Synapse bit is of less use to me; the development and report building is certainly something I enjoyed reading.
Nikola
That’s great to hear Neil! Cheers from Austria!
Nikola
Thanks Jolie, great to hear that!
Power BI: Making Date & Time Keys – Data on Wheels – Steve & Kristyna Hughes
[…] External Resources:https://data-mozart.com/inside-vertipaq-compress-for-success/https://github.com/AnytsirkGTZ/TimeTable_MCode/blob/main/MQuery%20Time […]
Jerry Brown
This is good news, and hard to find resources on this for PBI Report Server, so thank you! I have a question about where you enter the query. I need to set this up for my team but I’m missing information on how to access these logs. Do you enter the queries you show from inside Power BI Desktop (for server), or do you sign on from a url, or maybe MSSMS Report Server Configuration Manager?
Nico Botes
Awesome, indeed! Use it all the time… I see on your sample Report, you have a Year, and Month Slicer, each being a dropdown… I use the same, but additionally, I have a Date-Range slicer, that allows for getting to a custom date-range, or maybe specific days…
I have seen that when I use such slicers, like your Year and Month dropdown filters…and then change to a specific day, then go back to changing the Year or Month, only the Start-Date of the Date-Range slicer updates – I've checked "Edit Interactions", so that should allow for perfect interaction between the different Date slicers…which all come from my Date-Dimension table, with hierarchies… I'm just wondering if it is a bug in Power BI with the slicer itself…it's not a showstopper, but annoying…a quick reset-filters bookmark, or even the clear button on the top-right of the filter, sorts it out…it's just not the best user experience altogether…
Weekly Data Round-up 06 – 18/12/2020
[…] Extend Power BI theme with an image! – Nikola Ilic (@DataMozart), MCT, has blogged a really great trick to add images to a Power BI theme to automate adding images to Power BI report pages. Read more. […]
Les Actus Power Platform du 05/02/21 - Neoreeds
[…] 📊 Power BI : Bien démarrer avec Azure Purview pour Power BIhttps://data-mozart.com/hitchhikers-guide-to-azure-purview-for-power-bi-professionals/ […]
Nico Botes
Hello, have you ever encountered this whilst running a “job” in Azure Analysis Services?
Nikola
Hi Nico,
Honestly, not in AAS, as I’m using on-prem SSAS only.
Haradev
Hi Nikola,
I have been preparing for more than a month and relying on the practice tests. Can you suggest anything else apart from Microsoft's online training material? Do you think the MS official practice test is useful? I already finished Steven Parker's videos after taking your suggestion, and they were helpful. Thanks
Nikola
Hi,
I took the exam while it was still in beta, so there were not many resources out there (even MS Learn path came later). I don’t have experience with the MS official practice test, so I can’t say if it’s useful – probably yes, but you should maybe check that with someone who tried it.
I think Parker’s training is excellent for someone trying to complete DA-100, and I believe you will do great! Good luck with your exam!
Bibin Bastian
Very neat technique
Erik
Very interesting blogpost!
I am now getting curious if it is possible to choose both dimensions and put them both in a different column so the data will be grouped by both chosen dimensions.
Derek
It can be from different sources or excel, as long as you put a dataflow (plus enhanced compute engine) in between 🙂
Nikola
Hah, that's cheating😉!
Marcus Wegener
Hi Nikola,
if you merge two queries and expand the columns, it should bring back the query folding.
https://www.thebiccountant.com/2019/04/18/query-folding-joinkind-inner-gotcha-power-bi-power-query/
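For anyone curious what that looks like in M, a sketch of the merge-and-expand pattern Marcus refers to (the query and column names are hypothetical; see the linked post for the details and JoinKind caveats):

// Steps inside an existing let expression:
Merged   = Table.NestedJoin(QueryA, {"ID"}, QueryB, {"ID"}, "QueryB", JoinKind.Inner),
Expanded = Table.ExpandTableColumn(Merged, "QueryB", {"SomeColumn"})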
Nikola
Oh, man! Thanks for the tip Marcus, didn’t know that:(…Is that a bug, or what?
Marcus Wegener
It's never a bug, it's always a feature 😉
Nikola
🤣🤣
CHRISTOPHER MAJKA
Great blog post Nikola. There is also a Github that Alex has posted with M functions that fold, etc. https://github.com/itsnotaboutthecell/powerquerym/blob/master/QueryFolding.md
Nikola
Thanks Christopher. And special thanks for providing a link to the Github repo that Alex created for PQ related stuff!
How to Capture SQL Queries Generated by Power BI - H4Host.com - Latest Web Hosting News
[…] already written why you should reconsider using DirectQuery, and in which scenarios (maybe) it makes sense to go that […]
Nathan Watkins
As you might know, query folding will occur with Table.TransformColumns() if you use Date.From instead of DateTime.Date. Great article nevertheless!
Nikola
Great point Nathan! Thanks for expanding on this
Creating a Timetable Fact Table from a Schedule - Shetland Data
[…] rows will be 500 times the number of scheduled activities. Still a tool like Power BI should still make quick work of this. Again this query uses the prior as the […]
Ionut Danciu
Thanks Nikola for sharing, this is great stuff!
Only one thing is puzzling me: your DirectQuery “cold” data is an Azure AS tabular model?
Thanks,
Ionut
Nikola
Thank you Ionut. Direct Query “cold” data is in SQL Server database, as all client’s legacy technologies are on-prem (SQL Server and SSAS MD).
Ionut Danciu
Thanks for your fast reply!
This means you use your SQL Server on-prem database twice: as “hot” and “cold” data, right?
Nikola
“Hot” data is loaded into Power BI (in the aggregated table), but yes, it “lives” also in on-prem SQL Server. Essentially, it is physically just one big fact table in the SQL Server database, and it is split into two for Power BI – “hot” data goes in Import mode, “cold” data stays in SQL Server. Aggregated tables over “cold” data are imported into Power BI too, as they are not memory-intensive, and satisfy 95% of the reporting requests.
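To make the split more concrete, here is a minimal T-SQL sketch of how a single fact table could be exposed as two sources – one for the imported “hot” slice and one for the DirectQuery “cold” slice. The view names, column names, and the cutoff value are assumptions for illustration, not taken from the actual model:
-- "Hot" data: recent rows, imported into Power BI (and aggregated there)
CREATE VIEW dbo.vFactBetSummary_Hot
AS
SELECT *
FROM dbo.FactBetSummary
WHERE BetDateKey >= 20220101;   -- hypothetical cutoff between "hot" and "cold"
GO
-- "Cold" data: historical rows, left in SQL Server and reached via DirectQuery
CREATE VIEW dbo.vFactBetSummary_Cold
AS
SELECT *
FROM dbo.FactBetSummary
WHERE BetDateKey < 20220101;
GO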
Ionut Danciu
Thanks again Nikola for the detailed explanation – I understood now properly.
I was surprised at first glance, as I thought you used Direct Query directly against the on-prem SSAS MD, which is not supported at this time (as far as I know).
Another question: are any of the features you used here in preview mode, or are they all GA? In other words: is this approach safe to use in a production customer environment?
Nikola
No problem at all!
You’re right, Direct Query over SSAS MD is not supported (only Live Connection). Regarding your question: all the features are GA, so you can freely use them in a production environment.
Cheers!
Frank Tonsen
Nice tutorial. One question.
Why did you build an agg table based on factBetSummaryHot?
Isn’t it sufficient to import the hot data directly instead?
Nikola
Thanks Frank!
You’re right, you can also import “hot” data “as-it-is”, without aggregation, but it will then consume more memory.
General Ledger
It seems a screenshot is missing at “Based on this value, I will determine if the product belongs to High, Medium or Low group. To achieve that, SWITCH function comes in handy:”
Nikola
Oh, good catch! Thanks! It’s been fixed now:)
Marcus Wegener
Nice blog post Nikola! Just to be correct, you can use implicit measures with multiple different aggregation types.
You just have to drag the column several times into the value area and assign different aggregations to it.
Kind regards
Marcus
Nikola
You’re right Marcus, thanks for pointing this out! I’ve corrected this part of the article👍
Jessica Jolly
Great article! I was just talking about explicit and implicit measures to my students. Now I have an excellent link to share with them.
Nikola
Thanks Jessica, I’m glad if you find the article helpful!
Marius
Or you could just use calculation groups in TE!
That’s a game changer 🙂
Nikola
Great point Marius, thanks!
Denisse
Hello, when you remove the columns or reduce the rows, do you do it at the source or in Power Query?
Thank you,
Denisse
Nikola
Hi Denisse,
You can do it in both places. When I write a native SQL query to import the data, I remove unnecessary rows and columns on the source side, to preserve the query folding afterward. Whereas, when I import the table in a “regular” way, I filter the data in the Power Query Editor.
Hope this helps.
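As a small illustration of the first option, a native source query can project only the needed columns and filter the rows up front; the table and column names below are hypothetical:
-- Hypothetical native query used as the source: only the required columns and rows
SELECT
    SalesOrderID,
    OrderDate,
    CustomerID,
    SalesAmount
FROM dbo.FactSales
WHERE OrderDate >= '2020-01-01';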
Jon
Love it.
Can you make a blog post on how to calculate Mode (the most frequent value) at different levels of granularity? Such as by order, by product, by category, or overall, assuming the order qty is a whole number. Each order can have multiple lines for the same product or different products.
taras koniukhov
Thanks Nikola for sharing, very inspiring!
Nikola
You’re welcome Taras!
Balde
Hello Nikola,
Thanks for sharing.
I have one question: I can see you have many ID columns in your tables. What if you just deleted those unused ID columns to reduce the model size?
How do you proceed if you want to calculate a year-to-date?
Regards,
Nikola
Hi Balde,
Great point! For this demo, I’ve removed multiple dimension tables from the model, but forgot to remove the foreign keys (ID columns) from the fact table. As you rightly said, that would reduce the model size even more.
Regarding the YTD calculation – that should not be a problem as long as you are using a measure that hits the imported or aggregated tables.
Cheers,
Nikola
Steffen
Hello Nikola,
Very cool tips! I tried #3. Do you also know a script for formatting calculated columns?
Thank you
Steffen
Vinay B N
Wow! This is mind-blowing.
Bibin
Thanks for the detailed explanation
christopher majka
Looking forward to this series of posts – very relevant for any Power BI user out there!
Nikola
Thanks Christopher, glad to hear that you find the article useful! Hope you’ll enjoy the rest of the series:)
Rolly
Great article out there! Is there any documentation on how to create a star schema with multiple fact tables?
Nikola
Hey Rolly,
Patrick from Guy in a Cube has a great video on how to handle multiple fact tables in Power BI. Here is the link: https://www.youtube.com/watch?v=TnyRsO4NJPc
Hope this helps.
ARJUN DN
Nice one Nikola…
CHRISTOPHER MAJKA
Nicely done! Will definitely be able to use this technique in our reporting. Thanks
Nikola
Thanks Christopher, great to hear that!
Building an End-To-End Analytic solution in Power BI: Part 5 — Action, please! – Kamran Mahmoudi
[…] over a specific visual, the respective action should be displayed! I’ve already written how to enhance your report using tooltip pages, and here we will follow a similar […]
Ilya Rzhevskiy
Wow, I didn’t know this. However, in data science it’s always advisable to clean up tables, replacing blanks with 0s, because that could enable more complex algorithm calculations later on. So it depends on your usage of the data.
Nikola
Yes, it depends on the use case, but it’s good to be aware of the possible consequences.
Prince Kher
In an ideal reporting solution, data marts for the specific period should be in place. To be honest, performance-wise a table will definitely perform better if you can load the whole view’s data rather than joining, but if it’s a timely refresh of data, we can always refresh at a non-peak time for the report.
Data Connectors - SQL Database - BI-Lingual Analytics
[…] What is Query Folding: https://data-mozart.com/what-is-a-query-folding-in-power-bi-and-why-should-i-care/ […]
Georgi
Great post, Nikola! I am playing with this new visual now to build a case for one of my customers and everything looks perfect until I publish the report to the service. Then it says that the report couldn’t access the data source and it doesn’t render. Have you stumbled upon such an issue? I am thinking it might be because the source for my paginated report is an SSAS Tabular database, but then again, it doesn’t make sense to work in PBI Desktop and fail in the service.
Nikola
Thanks Georgi! Are you using Live Connection, or Import mode?
Anil
Thank you so much for the wonderful explanation about BI. I am curious to learn thoroughly. How can I get the next parts, please? Regards
Nichole Minnick
Hello Nikola, this is amazing! I have a couple of charts that will greatly benefit from this. However, when I try to apply the logic to a data point that is not a measure, it doesn’t work. Is there something I can do to convert the data to a measure or a different dax code?
Sushanth B
As you said, views are stored queries over multiple tables that execute when the view is called. OK, then what will happen when a view is joined to another table to get a result set? Will the stored query need to execute for each record in the join with the table?
Is this a best practice? Or what is the best practice in this situation?
SELECT * FROM tableA a
JOIN viewA b ON a.id = b.id
Will viewA’s stored query execute for every joined record from tableA???
Please clarify this for me…
Nikola
Hi Sushanth,
Here is a very good article explaining considerations in the scenario you mentioned: https://codingsight.com/top-3-tips-you-need-to-know-to-write-faster-sql-views/
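For context: when a view is referenced in a query, SQL Server generally expands the view definition into the outer query during optimization, so the view is not re-executed once per joined row – it is optimized together with the rest of the statement. A minimal sketch, reusing the tableA/viewA names from the question (the underlying table and columns are assumed):
-- Hypothetical view definition
CREATE VIEW dbo.viewA
AS
SELECT id, SUM(amount) AS total_amount
FROM dbo.tableB
GROUP BY id;
GO
-- The join below is optimized as if the view's SELECT were written inline as a derived table
SELECT a.*, b.total_amount
FROM dbo.tableA AS a
INNER JOIN dbo.viewA AS b
    ON a.id = b.id;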
TWIL: November 14, 2021 – André Vala
[…] 3 reasons to use views instead of tables in Power BI!In recent months, I was frequently asked if one should use tables or views when IMPORTING data in Power BI. I’ve intentionally marked word importing, as the main purpose of this article is to compare usage of views vs tables in Import mode. […]
Raju kumar
Very informative.
Alexandre Doyen
Thanks this is amazing! So concise. Much better than Microsoft lessons =)
Last Week Reading (2021-11-28) | SQL Player
[…] Power BI Performance Tuning Cheatsheet! […]
bschobs
This is great!!!
Vishwajeet
In a dataset, if there are parameters created using the Manage Parameters option in the Power Query Editor, how can we add or utilize those parameters in our native query?
Wouter De Raeve
Can partitions come from different tables?
Eg your import partition is one table and the DQ one is a different table (but they both have near-identical structures)?
If you then set up your ETL so your DQ-table is always the differential vs the import table, the scan to the DQ-table is not that costly…
Just don’t know if that is technically possible….
Nikola
Hi Wouter,
The approach you are proposing is possible and it’s called the Composite model. Basically, you keep one table in DQ mode, while the other(s) are in import mode.
More details here: https://docs.microsoft.com/en-us/power-bi/transform-model/desktop-composite-models
With Hybrid tables, you are simply splitting DQ and Import data within one same table. Hope this helps.
Ron
Hi Nikola, happy you write this article!! Trying to understand. So:
– scenario 1 is where the direct query data is recent/hot and the imported data is cold
– scenario 2 is where the imported, aggregated data is recent/hot and the directquery data is cold?
What if you have a composite model, that is only refreshed at the beginning of day (so no real-time data) whereby the requirement is that I want to have insight in aggregated (import) and very recent detailed (direct query) data ?
Is it then better just to stick to the composite model?
regards
Ron
Nikola
Hi Ron,
Well, the only correct answer is “it depends”:)
If you’re using a composite model where tables are with different granularity (I’m assuming the import data is aggregated per year/month/product/whatever…and DQ table goes to the lowest level of granularity), then I’d say you should stick with composite model. Because, in the end, a Hybrid table is just a table: so, you can’t define different granularities within one single table.
Paul Olding
Nice piece.
I find this a really interesting question and have come to the opposite conclusion – go with 2 actors as the default. It seems this puts me out of step with most experts on the topic but my reasoning is all about ease of use for self-service. I find that a big benefit of 2 actors is you can name the columns: Bet Placed Month, Bet Processed Year, etc etc. Much clearer for the end user to build a visual, and much clearer for the report consumer to have a visual with Title and axis labelled unambiguously.
To me the extra training of end users comes with the 1 actor approach. Having to explain that if you want to slice by date you have to pick the right measure, but you can choose any measure if you want to slice by product is what confuses. Also, ending up with a visual that, for example, counts bets by product that then mentions (Date Processed) in the Title when the date is totally irrelevant leads to confusion too.
Nikola
No issues with the “2 actors” approach at all! It just depends on a specific use case, and evaluating all pros and cons that come with any of these two solutions in that specific case.
Kamil Kaleciński
Great Article, thanks for clarifying how 2nd Scenario should be configured in practice, as I immediately imagined this version while learning about Hybrid table preview. I only have one question – Tabular Editor was raising an error until I set an import partition Data View to Sample mode. Do you know why it was needed and if there are any consequences related?
Kamil Kaleciński
Doesn’t matter – I just found that the problem was caused by an incorrect compatibility level. When I switched from 1500 to 1565, it all works as expected, so I am just leaving a hint for others. Great feature!
Kirill Perian
Thanks for posting this, Nikola! The requirement of slicers reacting to other slicers is not new and the common go-to solution for new Power BI developers is to accomplish it with bi-directional relationships. Somehow, the slicer filtering by non-blank measure results isn’t yet widely popular, so thank you for bringing it to everyone’s attention.
Nikola
Thanks Kirill! You’re absolutely right, I saw this multiple times, and it almost always involved a bi-directional relationship.
Martina
Hi,
we are trying to apply scenario #2 to our Power BI data model. I would like to ask what the Data View setting should be for each partition. We created 2 partitions – Import with Data View Full and DirectQuery with Data View Sample (as Full or Default is not possible to choose). After running a full process of the appropriate table in SSMS, only data from DirectQuery is displayed in the report/Excel. Can it be caused by the partition’s Sample Data View setting? Or do you have any idea where the problem could be? Thanks.
Krystian Sakowski
You shouldn’t set up any Data View settings.
That feature is deprecated & discontinued for Power BI.
Did you upgrade the compatibility level of your dataset to 1565, like Kay did in his article?
https://powerbi.microsoft.com/en-us/blog/announcing-public-preview-of-hybrid-tables-in-power-bi-premium/
Without doing that, you can indeed observe some legacy errors mentioning Data View settings.
Tom
A lot of drama in this article. lol
Don’t get me wrong – thanks for the article, it made me learn when to use the new DAX function CROSSFILTER.
Thank you, Nikola.
Hans Peeters
Hi Nikola,
Question: I tried page refresh with DirectQuery to an SSAS cube on-prem, but this doesn’t give me the desired result. I think this is not yet supported, and that what you show works only with DirectQuery to a database. Is this correct?
To test this, I created a PBI report, connected it to an SSAS cube, set the page refresh to 1 second, uploaded it to the PBI service, opened DAX Studio and turned the profiler on. If I open the report, this registers my queries, but I can’t see the queries from the page refresh; and if I update the cube in the background, there is no update in my PBI report.
Leveraging Hybrid Tables to get real-time data updates
[…] DirectQuery (uses no memory for storage but it showers and limits many reporting features) […]
Michael
“I’ll now duplicate my original flat table 4 times”
Why is it a duplicate and not a link? (I hope the translation is correct)
Why repeat all the previous steps for each of these queries?
Nikola
Hey Michael,
Thanks for your comment. You can also use Reference (I guess you thought Reference when you said link), but my preferred choice is almost always Duplicate. Let me explain why:
Duplicate creates a brand new copy with all the existing steps and a new copy is completely independent of the original query. You can freely make changes in both the original and new query, and they will NOT affect each other. On the other side, Reference is a new copy with one single step: retrieving the data from the original query. If you change the original query, let’s say, remove the column, the new query will not have it anymore.
Michael
If you use a Reference, then there are problems with the preview.
When the first table is the main table, you cannot see the columns of the referencing query and accordingly select them for the join. You can only do the opposite, but it makes no sense.
I hope you understand, I’m not an English speaker)
Nikola
All clear, thanks:)
Michael
yea, i already figured it out…
Khaled Chowdhury
I love this!!
Cardinality — Your Power BI Majesty! – Data Science Austria
[…] optimal condition — which means, reduce the data model size whenever possible! This will enable VertiPaq’s Storage Engine to work in a more efficient way when retrieving the data for your […]
Ambaji
This is excellent. Until now I was under the assumption that VertiPaq plays a role in both Import and DirectQuery, but with this I got clarity on row-mode storage and columnar storage too. Thanks for the best one.
Wale
This is fantastic. Had no idea you could do that gif thingy.
Bharat K
Hi Nikola,
This is a wonderful article. I am currently using SSAS Import mode to get 10 million rows of data and it’s super slow. Are there any suggestions or blogs on your site to refer to for dealing with long-loading queries? For now, I have split the data into multiple queries using SQL LAG, and then appended all the tables as a new query.
Thank you, Bharat
Nikola
Hi Bharat,
It’s hard to answer without knowing more detail. Generally speaking, if you’re facing issues with slow data refresh process, I’d strongly recommend checking Phil Seamark’s blog about visualizing the data refresh process. Then, you can identify which part (or which query) of the process is the bottleneck and check if something can be done to optimize it.
https://dax.tips/2021/02/15/visualise-your-power-bi-refresh/
Cheers,
Nikola
Wasiu
Good
Fernando Calero
Hi Nikola,
Great article, thank you. One comment: I see you don’t include the publishing date of your blog posts – do you consider that better than including it?
Regards,
Fernando
Nikola
Hey Fernando,
Thanks for your feedback! Actually, the original published date is included on the main page, while I keep the “Updated date” within the article itself (at the bottom), as it sometimes happens that I apply modifications to the article.
Philip Robinson
Hi Nikola, this article (and all the others in the DP-500 series) is really helpful. Just one question if that’s ok, why is the [factonlinesales] table in direct query mode? Can you end up in a scenario where the data in [factonlinesales] table and [Sales Product Agg] don’t balance as the direct query tables have the latest data and the import tables might not have been refreshed. Thanks Phil
Nikola
Hi Phil,
Thanks! For User-defined aggregations in Power BI to work, the detailed table (which is FactOnlineSales in our example) MUST be in DirectQuery storage mode. If you want a “one table fits all” solution, then I’d recommend taking a look at the Hybrid tables feature, which combines DirectQuery and Import within one single table! I wrote about this feature here: https://data-mozart.com/hybrid-tables-in-power-bi-the-ultimate-guide/
Hope this helps:)
Philip Robinson
Cheers Nikola!
Sreedhar Vankayala
I will ask you to add, if it makes sense
Data Refresh
– Query Dependencies
Level of dependencies. I have seen queries with more than 4 levels of depth cause long refresh times.
– Size.
Most of the time we don’t need all data, but we bring in those needed.
Data Model
– Data types
– Role Playing Date Tables or other Dimensions.
Note: Date tables get created internally (and linked) if we don’t have our own date tables.
– Relationships.
Some unwanted relationships can be deleted.
Some may need relationships
– Data formats and default summarize by
– Hide columns that are not needed (Hide in report view)
DAX
– Check the DAX generated using Performance Analyzer. (You placed Stop or Red icon not sure whether you meant to check the queries or ignore it.)
– Sync slicers
– Bookmarks
Visuals
– Pay attention to the number of elements displayed or retrieved
External Tools
– ALM Toolkit
– Model Documenter
– Dax Studio (You placed Stop or Red icon not sure whether you meant to check the queries or ignore it.)
– Tabular Editor
We have a lot more to consider. I gave those that fit the focus of your diagram, and adding them will enhance it.
Thanks
Sreedhar Vankayala
Swayam
I tried to replicate this with the Sales & Stores tables, but I don’t get the circular dependency issue at all.
carmen
Thanks for the detailed explanation Nikola
Ajit
Great article as always. Thanks Nikola.
DWIJEN BARUAH
Nice one
Can you send me the PBIX file?
When I try to put it into a matrix from the two slicers, it is not working.
2021 Week 33 | Power BI: Create Magic Tables – Workout Wednesday
[…] https://data-mozart.com/magic-tables-in-power-bi/ […]
Power BI Field Parameters and Slowly Changing Dimensions - SharewarePile.com - Latest Web Hosting News
[…] You can find more info about Power BI field parameters in the official blog post, but also here, here and here. The use case I’m talking about is slowly changing dimensions of Type 2, you know, the […]
Tingting
Hi, can I still apply for this free bootcamp? Thanks
Nikola
Hi,
Unfortunately, the application window for this edition is closed:(
Jessica Jolly
I have mixed feelings about allowing personalization. I don’t necessarily want to have to answer questions about changes that someone else has made. What are your thoughts about supporting personalizations?
Nikola
Thanks for your comment, Jessica. Generally, I like them and I think it’s an undervalued feature in Power BI. However, I understand your concerns. In the end, I guess it’s all about educating report consumers on how to use certain features and functionalities:)
Power BI XMLA Endpoints, Demystified | by Nikola Ilic | Jul, 2022 - Techno Blender
[…] there were many “game changers” — features that brought revolution instead of evolution! Composite models, Dataflows, AI-powered visuals, Hybrid tables, Bookmarks…You name […]
Sakasra vadamalai
First of all, thanks for such a wonderful article. Will the same hot and cold concept work with a Databricks source?
Joyce Woo
Hi Nikola,
This is wonderful, thanks for sharing. Can I ask whether the dates on the ‘Date UTC Agg’ table are unique? How can I make this table the ‘many’ side of a relationship to the ‘one’ side on the DimDate table?
Regards,
Bob Dobalina
UNION ALL will return all records, UNION returns a unique set… I think you might have it the wrong way around.
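A quick T-SQL illustration of the difference, using arbitrary values:
-- UNION ALL keeps every row, including duplicates across the inputs
SELECT x FROM (VALUES (1), (2), (2)) AS a(x)
UNION ALL
SELECT x FROM (VALUES (2), (3)) AS b(x);
-- returns 1, 2, 2, 2, 3

-- UNION removes duplicates and returns a distinct set
SELECT x FROM (VALUES (1), (2), (2)) AS a(x)
UNION
SELECT x FROM (VALUES (2), (3)) AS b(x);
-- returns 1, 2, 3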
Hristo
My source SQL query contains a transaction isolation level modifier, READ UNCOMMITTED. With it, the query does not fold. Without it – it does fold. Do you have an idea why this is the case? I can find some sense in presuming that folding will lift more load off the server than the transaction isolation level modifier, but still.
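One possible explanation (an assumption, not verified against this exact query): the SET statement turns the source text into a multi-statement batch, which Power Query cannot wrap as a subquery, so later steps can no longer fold on top of it. A hedged sketch of the two variants in T-SQL, with hypothetical names:
-- Variant that does not fold: a batch with a SET statement cannot be wrapped as a subquery
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT col1, col2
FROM dbo.SomeTable;

-- A table hint keeps the statement a single SELECT with the same read-uncommitted behavior,
-- which may leave folding intact (worth verifying via View Native Query)
SELECT col1, col2
FROM dbo.SomeTable WITH (NOLOCK);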
Ani
Hi
I have one query.
I created two partitions on my table – a hot data partition and a cold data partition, with Import mode and DirectQuery mode respectively. But when I filter the records from the hot data partition, it still hits DirectQuery. Does it mean that Import mode is not working at all in this scenario?
Thanks,
Ani
Nikola
Hi Ani,
Even though you’re retrieving the data from the import mode table, the DirectQuery query will still be generated (even though it should return an empty result set). You can check the exact query in SQL Server Profiler.
A Hidden Little Gem that Can Save Your (Power BI) Life | by Nikola Ilic | Aug, 2022 - Techno Blender
[…] data, reduced the cardinality wherever was feasible, adjusted column data types, and applied some additional recommended practices…And, then, I remembered Benni’s tip and decided to give it a shot and see what happens if I […]
Tamir Basin
1. Why would a denormalized model be more efficient in terms of storage? I would expect the opposite, as I have redundant data.
2. Why do fewer joins reduce the memory footprint?
Nikola
Hi Tamir,
Good questions!
1. In the Tabular model, VertiPaq stores compressed data. So, if you don’t have many distinct values in the column (let’s say a column with 50,000 records and 20 distinct values), this column will be highly compressed and therefore potentially consume way less memory than the relationship between two separate tables (relationships also have a memory footprint). Of course, that’s not always the case, but potentially it could be.
2. Please see my answer on 1. Relationships also consume memory in the data model, so getting rid of joins means you are also getting rid of these relationships.
Hope this helps.
Dritan Skarra
When you deleted duplicates, you deleted them from the CTE, which is available only within the query.
Meanwhile, the duplicates remain in the SpecialOffer table.
Or maybe you just wanted to demonstrate that we can reference a CTE in a DELETE statement as well.
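For what it’s worth, in SQL Server a DELETE issued against a CTE does remove the rows from the underlying base table, which is exactly why this pattern is used for de-duplication. A minimal sketch – the SpecialOffer name comes from the comment above, while the duplicate-defining columns are assumptions:
-- Deleting "through" the CTE removes the duplicate rows from dbo.SpecialOffer itself
WITH Numbered AS
(
    SELECT
        *,
        ROW_NUMBER() OVER (
            PARTITION BY Description, DiscountPct   -- columns assumed to define a duplicate
            ORDER BY SpecialOfferID
        ) AS rn
    FROM dbo.SpecialOffer
)
DELETE FROM Numbered
WHERE rn > 1;   -- keeps the first row of each group, deletes the rest from the base table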
3 Steps to troubleshoot any DAX Query | by Nikola Ilic | Sep, 2022 - Techno Blender
[…] two key components of the underlying architecture — Formula engine and Storage engine — in this article, and I strongly encourage you to read it before proceeding further, as it will help you to get a […]
Olivier
Hi Nikola,
Thank you for your work, and all your articles. They are very clear and helpful.
My question is close to the previous one.
We always discuss aggregations with both Import and DQ tables. Is there any sense in doing it with only Import tables, to reduce the granularity scanned by most measures while still being able to deep dive, and avoid multiple switch measures checking which table should be queried?
Nikola
Hi Olivier,
That’s a fair question. However, user-defined aggregations in Power BI work if and only if the main (fact) table is in DQ mode. Hope this helps.
Cheers,
Nikola
Tom
Great article!
Eirik Maalen Andersen
Hi Nikola, I have an issue with the Power Apps visual refreshing when I navigate between different report pages. Do you think this will potentially fix this issue? I want the Power Apps visual not to start the app over and over again…
Thanks! Eirik
EL-FIGHA
The content of the article remains too thin compared to an ambitious title like “Everything you need to know”
Would it be possible to update the article by including the following points:
– Scenario where a workspace is connected to ADLS Gen2
– Consumption of dataflows, especially in a datamart
– Incremental refresh
– Access permissions, in particular permission to modify a dataflow
Nikola
Thanks for your comment.
These are all valid points, but it would probably be a chapter in the book if I covered them all in one article:)…I’ll cover some of these topics in the next articles.
G Harris
Does this only refresh when the report is being viewed, or will it still refresh when it’s not being viewed… i.e. at 3am when no one is working?
DAX variables, virtual relationships, and iterators in Power BI | by Nikola Ilic | Dec, 2022 - Techno Blender
[…] in a role-playing dimension scenario. Unlike in one of the previous articles, where I explained how to handle role-playing dimensions using the USERELATIONSHIP() function to change the active relationship between the tables, I’ll […]
Ravi Raja
Amazing information Nikola. You are spreading the music very melodiously. It’s data to the ears (like music to the ears).
Nikola
Thank you very much Ravi!
Roshan
Where do you add the T-SQL query (INSERT query) in the RDL report?
Suzan van Echtelt
Hi Nikola, thanks for all your interesting and great info, support, documents, trainings, etc.
I find them very inspiring.
Have a great Christmas time and New Year!
Suzan van Echtelt from the Netherlands
Nikola
Hi Suzan,
Thank you very much for your kind words, I sincerely appreciate it. Happy holidays to you and your family, enjoy the festive period.
Best,
Nikola
Jagadeesh
I created a new dataflow using SQL Server. Do I need to refresh the dataset every time after the dataflow refreshes, or will the data update in the dataset automatically?
In this case, are both the dataflow and dataset scheduled refreshes needed, even though my dataset is only connected to the dataflow?
Miguel
This was a super useful post! Thanks a lot Nikola!
Looking through the Window— Calculating customer lifetime value with new DAX functions | by Nikola Ilic | Jan, 2023 - Techno Blender
[…] functions extensively for various analytic tasks. I’ve described one of the possible use cases in this article, but there are literally dozen of scenarios that can be quickly and intuitively solved by using […]
Dev
It’s not a good practice to use views over tables if you are using as is data – regardless of whether you are using import or direct query. As you mentioned in the article, views are nothing but a stored query which is run every time data is called. This means database resource is used each time – and can cause substantial slowdown and cost increase which could totally have been avoided by using tables directly. I read your point on using views if the amount of columns / data is significantly less than what’s on the parent table – I agree with it, but it should be a very specific scenario. It’s almost always a better practice to use tables over views if you are using data as is.
– Dev (10+ years as BI Developer)
Nikola
Hi Dev,
Thanks for your comment.
I talked about using the data from the SINGLE table “as-is” (no joins, calculations, etc.). So, only omit unnecessary columns and/or change column names.
Let’s say that my view A returns 10 out of 15 columns from table B (5 columns are not necessary for reporting purposes). Why do you think that something like: SELECT col1,col2,col3…col10 FROM view A will be more inefficient than running SELECT col1,col2,col3…col10 FROM table B? There’s no difference in terms of performance, costs, etc.
Also, if you import the data to Power BI, why import table B with 15 columns and then remove 5 of them using Power Query, if you can clean up stuff in the database and import view A with only 10 necessary columns? Views, same as tables, support query folding, so no difference from that perspective as well.
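A minimal sketch of that idea (the names and column list are illustrative): the view simply projects the reporting columns, so importing it sends essentially the same work to the engine as selecting those columns from the table directly.
-- View that exposes only the 10 reporting columns of a 15-column table
CREATE VIEW dbo.vSalesForReporting
AS
SELECT col1, col2, col3, col4, col5,
       col6, col7, col8, col9, col10    -- col11..col15 intentionally left out
FROM dbo.SalesTable;
GO
-- Importing the view...
SELECT * FROM dbo.vSalesForReporting;
-- ...is equivalent, performance-wise, to importing the projected columns from the table:
SELECT col1, col2, col3, col4, col5, col6, col7, col8, col9, col10
FROM dbo.SalesTable;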
Kliment
How did you calculate Average Days between orders ALL? I couldn’t find it in the post.
Fantastic post!!!
Marina Ustinova
Thank you for the post!
However, some formulas do not work for me. Below is my calculation of the AVG days between orders. I have used your formula, but the error I am getting is: “The column [NUmber of days btw Orders] specified in the SUMMARIZE function was not found in the input table”. Because it’s a measure, it’s not a column, I guess. But how did it work in your formula?
AVG Days btw Orders = CALCULATE(AVERAGEX(
SUMMARIZE(
ALLSELECTED(Orders), Orders[Customer Name], Orders[Order Date],[NUmber of days btw Orders]),
WINDOW(1,ABS,-1,ABS, SUMMARIZE(ALLSELECTED(Orders),Orders[Customer Name],Orders[Order Date]),ORDERBY(Orders[Order Date]),KEEP,PARTITIONBY(Orders[Customer Name]))))
Nikola
Hi Marina,
Thanks for your comment.
Please check the version of the Power BI Desktop you are using. These functions are supported from the December 2022 version of PBI Desktop.
Hope this helps.
Daniel
Hello, I’ve noticed a lot of confusion regarding the location of the data on a dedicated SQL pool. I’ve read in some places that the data is kept inside the database of the dedicated SQL pool, but as you mentioned here and it states in the Microsoft documentation, the tables’ data are kept on a data storage outside of the dedicated pool. What are your thoughts on this?
Nikola
Hi Daniel,
As far as I know (and that’s also what Microsoft docs say), data is stored outside of the Dedicated pool, in Azure Storage.
fari
Awesome! It was a really helpful one.
Paulo Martins
Hi Nikola,
Thanks for the great article!!
Can you extend your comparison between Power BI dataflows and ADF Wrangling Data Flows to consider Azure Synapse Data Flows?
I’m just surprised with the ADF limitations and curious to understand if Synapse is also very limited, which will make PBI dataflow the best option for most use cases.
Nikola
Hi Paulo,
Great suggestion! I’ll cover this in one of the next articles.
Best,
Nikola
Douglas de Brito
Hi Nikola, thanks for this content. In my scenario, I need to let the user pick “Select All”; in this case, DAX functions like ISFILTERED or HASONEFILTER return false instead of true. Any way to overcome this? Thanks, Douglas
Paul Hernandez
Hi Nikola,
incredible article and extremely helpful 🙂
I have a question. I have a Power BI report running on a SQL dedicated pool. At the moment we only have columnstore indexes. On one side, we have some dashboards and KPIs where the columnstore indexes are performing quite well. But on the other side, we have matrices with drill-down capabilities, where we notice the compression applied by the columnstore index is not performing well.
As an improvement, I want to create materialized views in order to group the columns used in the matrices.
The question is, should I remove the columnstore index and add a “normal” clustered index on the table, in order to get better performance with the materialized view, and let the engine decide where to get the data?
The biggest fact table has currently 2.2 billion records.
BR.
Paul
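For reference, a materialized view in a dedicated SQL pool pre-aggregates the grouped columns, so the matrix queries can be served from the smaller, pre-computed result. A rough sketch – the schema, names, distribution column, and measures are all assumptions:
-- Hypothetical materialized view pre-aggregating the columns used by the matrix visuals
CREATE MATERIALIZED VIEW dbo.mvSalesByProductMonth
WITH (DISTRIBUTION = HASH(ProductKey))
AS
SELECT
    ProductKey,
    MonthKey,
    COUNT_BIG(*)     AS RowCnt,       -- row count per group
    SUM(SalesAmount) AS SalesAmount
FROM dbo.FactSales
GROUP BY ProductKey, MonthKey;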
DRISS EL-FIGHA
…and CDM FOLDER FORMAT ?
Nikola
Hi Driss,
It’s Common Data Model. More details here: https://learn.microsoft.com/en-us/common-data-model/
Hope this helps.
Best,
Nikola
Inoussa
Nice! How can I get your articles when you publish them?