Book Review: Expert Data Modeling with Power BI (2nd Edition)

I’ve known Soheil Bakhshi for a while now and was keen to review the second edition of his new book: Expert Data Modeling with Power BI . Our friends at PackT sent me a review copy.

Data Modeling

The data modeling that’s needed for good Power BI outcomes is something that’s dear to my heart, and I was keen to see what Soheil had to say, and where I agreed or disagreed.

It’s great to see Soheil covering this content. I cannot tell you how often I now see people building Power BI reports over copies of data, directly from transactional systems, without taking the time to do data modeling work. That is almost always a mistake.

A good example right now is the apparent love affair with database mirroring in Fabric. I see people mirroring in their transactional system data, then exposing it to Power BI via DirectLake. Again, that’s a mistake.

Content

This is a really substantial book. Creating it will have been a huge undertaking. I was also glad to see that the sample code is available from a GitHub repository. One interesting approach is that the color images from the book are also available as a download. I can imagine that helping many readers, as it’s hard to achieve good outcomes with Power BI screenshots in a book format.

The book starts by describing the different layers involved. One area where I would differ from Soheil, is that I don’t assume that the data preparation layer will be in Power Query. In many cases, that doesn’t have the performance or flexibility that I’m after. For example, far too many processes in Power Query can be single-threaded. If I have a lot of data to prep, I prefer to do it in a database. Plus, while I love Power Query, I’ve never loved the fact that it’s case-sensitive. I’ve written about it before .

I don’t think that case-sensitivity has any place in an analytical system.

Soheil also provides some coverage of Power BI licensing options. I’m always nervous to see that in books, given the chance of it changing. And that already affects some of the content that covers this.

I like the way that Soheil has call outs that highlight good practices throughout the book.

Naming

Naming things is always a topical issue in all parts of computing, but with modeling, it’s even more important. In general, I have no issue with the naming conventions that Soheil has used, although there are some exceptions. For example, I like straight-forward human-readable names, and generally Soheil has done that, yet there are some exceptions. For example, instead of “IsValidDate”, I would use “Is Valid Date”. I keep space separated words in the names, usually everywhere except for keys, and certainly for anything exposed to a human.

I think it’s really important to avoid “computer-speak” or “IT-speak”, so I was glad that he didn’t have names like “DimCustomer” and “FactSales” all over the place. I think as soon as you do this, you’ve put a barrier between the end user and the system. Keep it all human. Instead of “DimCustomer”, I like to just see “Customer”. That’s what the human thinks of it as.

Similarly, I’m not a fan of putting an fn prefix on function names. I think functions should be named for what they do, not how they are implemented.

Another point I’d make is that I like to see consistency across singular vs plural names. In semantic data models, I’m a fan of singular names. So “Customer”, “Region”, etc. work for me but “Sales” doesn’t. For a start, it’s too generic and I’d rather see the table named for what a row in the table represents. Perhaps even “Sales Transaction”, or “Sales Order”.

I’m also wary of naming things in ways that overload existing system names. While there’s nothing wrong with “Currency” as a table name, per se, it is the name of a data type.

Other Modeling Areas

Overall I liked the coverage of dates, with a few exceptions. I’d like to see the option for a date table to be brought in from the source system emphasized even more. I’m not a fan of the CALENDAR() and CALENDARAUTO() functions in DAX.

I really liked the discussion about getting data from multiple source systems. This is another area that’s often ignored.

I think I would have omitted the discussion around Power BI Datamarts. I’ve never liked them and I can’t see they have any future at all. Users should be told to avoid them.

Given the current love of medallion architectures, it was good to see them explained in the book.

The transformation content covers the options in Power Query really well.

I liked the discussion on query folding.

I loved seeing this callout “It is advised not to use calculated tables for any data transformation activities. We always move the data transformation logic to the source system when possible.” Far too often I see Power BI developers using convoluted DAX and/or Power Query logic when they could have done the calculation much easier and faster in the source system. You can’t always do that, but you often can. The same applies to the use of calculated columns. We try to eliminate those as well.

I was glad to see the warnings around the use of many to many relationships. I see far too many of these. Almost always, they indicate an underlying modeling design issue.

Acronyms

One additional step that I’d like to see called out more, is the creation of a list of acceptable acronyms. Without this, I see systems where acronyms are applied inconsistently, and it’s an area that can easily confuse users.

For example, while MOM might seem obvious to some (i.e., month on month), it’s right at the edge of acronym complexity for me. Going further than that is treading into tricky territory. I’m not sure that most would get DOTIF all that easily (i.e, delivered on time in full).

Even if most existing employees at a company might be used to a specific acronym, the more obscure it is to a normal human, the harder it is for newcomers to get to know the systems.

Data Types

As it’s a book on modeling, an area that I would like to see described further is around data types. It’s an area where Power BI does not shine.

For example, even though it has a “decimal” data type, it’s not a true decimal type. It’s a “real” data type, and this can lead to all sorts of rounding issues. I have customers who just don’t understand why Power BI can’t add up numbers correctly. It’s ok for monetary values, as “currency” is a type of exact storage, but there are many other types of data that need precision and don’t involve money.

The book does cover the data types, but I’d like to see more warnings around this type of issue. It does describe the “decimal” data type as a “real” number. But I’m not sure that all readers would get that it also means “only approximate”.

Other Comments

In the section on monitoring, it would be good to call out that SQL Server Profiler is deprecated for use with the database engine, and that getting used to Extended Events might be a good idea.

I would like to have seen more discussion around incremental refresh. This is much harder to get right than most people assume.

In the RLS section, another option to consider would involve directly querying the Microsoft Entra ID related APIs, rather than periodically exporting JSON files.

I don’t love having a chapter at the end that describes new things. For me, they need to be sprinkled back through the book. As mentioned before, I’d also like to see the whole discussion on Power BI Datamarts in the final chapter removed. There is a lot of discussion on them in this chapter and they have no future.

Summary

Overall, I really enjoyed this book. Great work Soheil !

8 out of 10

2025-05-01