Book: Pro Power BI Architecture

One of my Kiwi buddies who specializes in Power BI is Reza Rad. I was pleased to see that he now has a set of eBooks on Power BI, and I was especially happy to see one called Pro Power BI Architecture.

There are lots of books around that discuss how to use Power BI, but there's been a real lack of books on architecting solutions using Power BI. So if you want to learn to develop dashboards or reports, this isn't the book for you. Reza has other books for that.

I enjoyed reading the book, and I liked the degree of coverage it gave to each of its topics.

If you are looking for ways to integrate Power BI into your solutions architecture, this book is a pretty good starting point.

What I was really hoping for though, was more info on administration. Mind you, the book doesn't claim to provide that. I keep getting asked about materials around administration issues. Perhaps that's another edition for Reza to consider. But the architects who need high level overviews of all the key topics should be pretty happy.

Bottom line?

I enjoyed reading this book, and it's a pretty easy read. Great for architects considering Power BI.

Greg's rating: 7 out of 10

Note: as an Amazon Associate I earn (a pittance) from qualifying purchases but whether or not I recommend a book is unrelated to this. One day it might just help cover some of my site costs. (But given the rate, that's not really likely anyway).

Bring your reports and analytics to life. Learn SQL Server spatial now.

I love working with SQL Server spatial. Don't just use boring tables and charts. Spatial data can really help to bring your reports and analytics to life. Don't assume spatial is just about maps. (It's not). Don't assume you don't have spatial data. (I'm sure you have addresses, locations, etc.)
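
To give you a quick taste, here's a minimal T-SQL sketch. The dbo.Customers table and its Latitude and Longitude columns are hypothetical stand-ins for your own geocoded address data:

-- Hypothetical names throughout: dbo.Customers and its Latitude/Longitude
-- columns stand in for your own geocoded address data
DECLARE @StoreLocation geography = geography::Point(-33.8675, 151.2070, 4326); -- lat, long, SRID 4326 (WGS 84)

-- How far is each customer from the store, in kilometers?
SELECT c.CustomerName,
       geography::Point(c.Latitude, c.Longitude, 4326).STDistance(@StoreLocation) / 1000.0 AS DistanceInKm
FROM dbo.Customers AS c
ORDER BY DistanceInKm;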

We've been working hard lately on getting many of our popular in-person courses converted to being available online and on-demand. We're really proud to now have SQL Server Spatial for Developers and DBAs available.

We wanted to give you an experience that's as close as possible to a classroom course, but right in your own location, and at a time that suits you.

It's a detailed course. It's not just a set of videos. There are hands-on labs that you can do (we supply the data), quizzes that you can complete, and once you're even more awesome, you can get a certificate of completion.

It's $195 USD but, best of all, if you use the coupon GOSPATIAL at the checkout, you can enrol for $95 USD (until the end of January). Just click on the enrol button (it says $195), and then click Add Coupon on the next page.

Check it out here: https://training.sqldownunder.com/p/sqlserver-spatial-data

And check out all our courses here: https://training.sqldownunder.com

More courses are on the way very soon.

Business Intelligence: Success is about small starts leading to bigger things

I spend a lot of time on client sites and, time and again, one of the mistakes that I see people making is trying to start with large projects. I think one of my all-time favorite quotes about IT is:

Any successful large IT system used to be a successful small IT system.

The next time you're thinking about creating a project that's going to have a big bang outcome, please remember this. The history of the industry is that it really is likely to be a big bang, just not in terms of being the big success you'd hoped for.

A staggeringly high percentage of large IT projects fail.

This is even more important in business intelligence projects. The industry is littered with companies that have spent a fortune on BI projects, and almost no-one in those companies can, or does, use the outcomes. It's a sad but true statistic.

Asking users what they want, gathering a large amount of detail, then building it, testing it, and delivering it to them sounds good, but it almost never works. Unfortunately, it's how many larger organizations think projects need to be managed. They assume that creating an IT project is like building a bridge.

It's not.

The first problem is that the users don't know what they want. Don't blame them later because you think they didn't give you the right instructions. That's your fault for assuming they can describe exactly what they want. In many cases, until they see some of the outcomes, they won't start to understand what the system will do, and until then, they won't realize what they really need.

Second, no matter how clearly you document their requirements, it won't be what they need. One of my Kiwi buddies, Dave Dustin, was having a bad day once, and I remember he mentioned that he was going to just spend the day delivering exactly what people asked for. That was wonderful and beautifully insightful, because we all know that it would lead to a disaster. It's little surprise: they might have said "sales" but they really meant "profit", and so on.

Finally, the larger the project, the longer it will take to deliver, and by then, even if you'd done it perfectly, the users' needs will have changed, so it still won't be what they want.

When you're starting a BI project, I'd implore you to find a project that has these characteristics:

  • Small enough to be done in a couple of weeks at most
  • Based on mocked-up data in front of the target users
  • Large enough to make a significant difference to someone's life or work
  • Targeted at someone who's important enough to be a champion for the rest of what you want to achieve
  • (Ideally) targeted at someone "who pays the bills"

Start by doing a great job, then build on that once you have support.

SQL: And where did the SSRS Private Assemblies folder move to in VS2017?

Recently, I posted about the templates folder for SQL Server Reporting Services designer moving in Visual Studio 2017. You'll find that blog post here.

I've also now found that the Private Assemblies folder has moved too. I do wish these locations would be fixed, shared, and known rather than needing to play hide and seek when updates are made to the SSRS designer tooling.

The Private Assemblies folder was used to hold a copy of any .NET assemblies that are referenced in your SSRS projects. By using .NET assemblies, you can share logic that's used in reports. It's worth noting that wherever we can, we try to avoid using .NET assemblies for this, and wish that SSRS had a way of dealing with common code inclusions, but that's a topic for another day.

Generally we only use code in reports for things like formatting functions, because we think that reports should render the data, not really calculate the data. (That should be done by a procedure or function on the server).
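
As a rough sketch of what I mean (the table and column names here are hypothetical), the calculation lives on the server, and the report just renders the result:

-- Hypothetical names: the server owns the calculation, and the report
-- simply renders the returned value
CREATE FUNCTION dbo.CustomerTotalSales
(
    @CustomerID int
)
RETURNS decimal(18,2)
AS
BEGIN
    RETURN
    (
        SELECT COALESCE(SUM(oi.Quantity * oi.UnitPrice), 0)
        FROM dbo.OrderItems AS oi
        INNER JOIN dbo.Orders AS o
            ON o.OrderID = oi.OrderID
        WHERE o.CustomerID = @CustomerID
    );
END;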

One of the reasons that we avoid using .NET assemblies for shared logic in reports is that when you do that, you buy yourself deployment issues. Where will the assembly (DLL) live? You basically have two choices:

  • Install it in the GAC (global assembly cache) – we don't like this one as it pollutes the GAC with application-specific logic.
  • Install it in both the Report Server's bin directory, and also in the private assemblies area for the SSRS designer. It has to go in two places, as the designer needs to be able to find it at design time, and the report server needs to be able to find it at run time.

What has recently moved though, is where the private assemblies folder is located. Previously it was here:

\Program Files (x86)\Microsoft Visual Studio 14.0\Common7\IDE\PrivateAssemblies\ProjectItems\ReportProject

Now it's either here:

\Program Files (x86)\Microsoft Visual Studio\2017\SQL\Common7\IDE\CommonExtensions\Microsoft\SSRS\

Or here:

\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\Common7\IDE\CommonExtensions\Microsoft\SSRS\

Like last time, I presume the difference is because of how the Visual Studio shell was installed, ie: either by SSDT (the SQL one), or by Visual Studio (the Enterprise one).


SQL: Where did my Reporting Services templates disappear to in VS2017?

I'm always telling students that one of the first things you need to learn to do when working with SQL Server Reporting Services (SSRS) is to create report templates.

Now I really wish that SSRS had a real templating system, but what we have at present is just a folder that contains the items that we can create. For example, in an SSRS project, if I right-click Reports, click Add, then New Item, I see a list of templates to choose from.

Now what I really want to see is something like MyCompany_A4_Portrait ie: I should already have company templates set up for paper size and orientation, etc. Usually I'd have a template that already has a logo, a place for parameters to be displayed, a report title, etc.

Templates are just report files (ie: RDL files) but placed in a special folder on your system.

If you're starting with a blank page to create a report, you're doing it wrong.

But where do these templates come from? In the Visual Studio 2015 version of the tools, they were here:

C:\Program Files (x86)\Microsoft Visual Studio 14.0\Common7\IDE\PrivateAssemblies\ProjectItems\ReportProject

I found after installing the VS2017 version of the tools though, that they weren't there any more. Instead, so far I've found them in different locations on different machines. One was here:

C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\Common7\IDE\CommonExtensions\Microsoft\SSRS\ProjectItems\ReportProject

The other was here:

C:\Program Files (x86)\Microsoft Visual Studio\2017\SQL\Common7\IDE\CommonExtensions\Microsoft\SSRS\ProjectItems\ReportProject

At one of those locations, you'll find the template files.

Note the difference is that it either says SQL or it says Enterprise in the path. I presume this depends upon which version of the VS shell you have installed, either the version installed by SQL Server Data Tools, or the full version installed by Visual Studio.

I wish this was less confusing, and I wish there was a standard shared folder that templates could live in ie: shared amongst a group of developers in a standard way.


Opinion: Which SQL Server columns should be nullable – no place for magic values

In a recent blog post, I wrote about the ANSI_NULLS option for tables, and after that I had several more questions related to NULL values.

Note that I mentioned NULL values, not values of NULL, and not values equal to NULL. That's because "being NULL" is a state that a value is in, not itself a value. That's why our queries say IS NULL and not = NULL.
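
Here's a quick illustration, using a hypothetical dbo.Customers table:

-- With the default ANSI_NULLS ON, comparing anything to NULL yields UNKNOWN,
-- so this returns no rows, even for rows where MiddleName is NULL
SELECT CustomerID, GivenNames
FROM dbo.Customers
WHERE MiddleName = NULL;

-- Testing the state of the value is what works
SELECT CustomerID, GivenNames
FROM dbo.Customers
WHERE MiddleName IS NULL;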

Now because of that, many developers see values that are NULL as a pain in the neck, as they have to be dealt with separately, unlike other values. This can lead to many poor design decisions, but I want to start today with the decision to use magic values.

Some developers will just decide that their databases will never have values that are NULL.

The problem with designs like this is that if they have a column that should be NULL (because no sensible value could go there), they tend to put a magic value there instead.


Magic values are values that are stored like any other value but are intended to mean something else.

The problem with magic values is that as soon as you use a real value to represent something that means something different, every piece of software that deals with the database, and anyone who queries the database directly, has to understand the substitution trick. These schemes go wrong all the time. Inevitably, someone doesn't follow the convention, and these values have a way of leaking into the real world.

Here's a simple example: a UI displaying a date value as 1/01/0001.

Now we don't know exactly where this went wrong. It could be as simple as the UI not knowing how to display a non-existent value, or the value 1/01/0001 being stored instead of the value being NULL. Either way, it shows nonsense. What it no doubt should say is never.

This is very common with dates. I work on systems all the time that, instead of NULL dates, use values like 1/1/3000 or 1/1/9000. Invariably, these values leak out into the real world.
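
Here's a hypothetical sketch of how that plays out:

-- Hypothetical example: 31 Dec 9999 used as a magic "no end date"
DECLARE @Subscriptions TABLE
(
    CustomerID int,
    StartDate date,
    EndDate date    -- should arguably be NULL while the subscription is ongoing
);

INSERT @Subscriptions
VALUES (1, '20170101', '20170630'),
       (2, '20170301', '99991231');  -- the magic value

-- Any code that doesn't know the trick happily computes nonsense
SELECT CustomerID,
       DATEDIFF(DAY, StartDate, EndDate) AS SubscriptionLengthInDays
FROM @Subscriptions;  -- customer 2 shows close to 3 million days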

There is a wonderful story of a country where OCR machines had been introduced to read tax returns instead of humans and, by design, when they couldn't read the values, they inserted all nines. Pity the people writing the next layer of code didn't get the memo. The huge values made their way right into the country's budget calculations before being identified as wrong.

This is surprisingly common. Here's another example: a weather display showing a wind speed of 9999 and a precipitation value of -999.

I'm guessing that they decided to store 9999 for wind speed when they didn't have one, and -999 for precipitation when the value was missing. Again, pity the UI designer didn't know that, or didn't test for that.

Magic values are just one common problem. Next week, we'll discuss another.


Shortcut: Import and Export Settings in SQL Server Management Studio

Whenever I need to work on a new laptop or server, or whenever I change versions of SQL Server Management Studio, I kick myself for not remembering to export my settings, so I can import them again.

I spend quite a bit of effort getting SSMS configured the way I want, so it only makes sense to save the settings. Saving them isn't perfect but it's far better than not having done it.

From the Tools menu, choose Import and Export Settings:

As an example, let's export the settings. I'll choose Next:

Notice that it's not an all or nothing export. I can choose details of which settings or groups of settings to export.

In this case, I want all of them, so I just need to pick a name and a location:

And next time I change to a different machine or new version, I can just import them and pat myself on the back for remembering.

SDU Tools: Separate T-SQL Strings By Case

If you've ever used SQL Server Reporting Services, you'll have noticed that when you drag a database column into a table, it auto-magically converts the name of the column into a more English-readable name.

Image from MSDN

Notice how when the LineTotal column has been dragged into the table, the heading has been set to Line Total with a space. What it is doing is taking a Pascal-cased or camel-cased name and separating the words with spaces, based upon where the capital letters are.

In our free SDU Tools, we have a tool that does that as well.
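
To show the idea, here's a rough sketch of the technique. (It's an illustration only, not how SDU Tools actually implements it.)

-- Rough sketch only: insert a space before each capital letter after the
-- first character (ASCII letters only, for simplicity)
CREATE FUNCTION dbo.SeparateByCaseSketch
(
    @InputString nvarchar(200)
)
RETURNS nvarchar(400)
AS
BEGIN
    DECLARE @Result nvarchar(400) = N'';
    DECLARE @Position int = 1;
    DECLARE @Character nchar(1);

    WHILE @Position <= LEN(@InputString)
    BEGIN
        SET @Character = SUBSTRING(@InputString, @Position, 1);
        IF @Position > 1 AND UNICODE(@Character) BETWEEN 65 AND 90
            SET @Result += N' ';
        SET @Result += @Character;
        SET @Position += 1;
    END;
    RETURN @Result;
END;
GO

SELECT dbo.SeparateByCaseSketch(N'LineTotal');         -- Line Total
SELECT dbo.SeparateByCaseSketch(N'theQuickBrownFox');  -- the Quick Brown Fox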

In the example above, you can see both a Pascal-cased phrase and a camel-cased phrase that have been separated.

You can see it in action here.

For more information on our SDU Tools, and to join our SDU Insiders to get them and our other free resources, please just look here:

http://sdutools.sqldownunder.com


Opinion: Constant churn breaks community learning for software applications

A current trend that I can't say I love is constant churn within software applications. I have no interest in going back to the days when we got a new version of SQL Server or Power BI, etc. every few years.

It's also not a case of "who moved my cheese?"

In fact, I thrive on change. However, I've now become really concerned about how anyone:

  • Learns to use a complex application
  • Remembers how to use a complex application when they don't use it daily

I first really struck this issue with Azure. If I was teaching a class that used Azure, I could check every single lab on Sunday night, then Monday morning, the students would find it had all changed. That's OK for an experienced person, but not OK for a learner.

I love the rate of change for Power BI. We're endlessly getting wonderful new things. But I have to say that every class that I teach on this is like a new experience. I've got another one this coming Tuesday. I used to look forward to them but now I have a major hesitation every time, as I wonder what parts of the labs will have broken.

This is now an ongoing challenge for all software of this type, though. I helped create some labs for VSTS very recently, and when I look at the product now, it barely resembles the one that I built the labs on.

Is it better? Probably yes.

But even though it might have been a few months ago, it feels like just the other week, and yet, not only has the UI changed, entire concepts have been added or removed, and the order that things are done in has changed substantially.

I don't know the answer to this but the current rate of churn is a substantial issue.

I gather the plan with the DevOps guys is to put a set of labs on GitHub, and let people who are doing the labs point out the issues day by day as they strike them. Again, for experienced users that might work, but I really wonder about newcomers.

Will they realize the app must have changed and that it's all different, or will they just think the product is too hard to use? Either way, they'll be very frustrated.


And while initially learning the product is one thing, I'm worried about it longer-term. A product like VSTS lets you set up automation, and you'd hope you won't need to change it constantly. But if every time you go to make a change, you're struggling to use it like you're a newbie again, that's a problem.

Finally, I'm really concerned about ongoing support.

The vast majority of support of software applications today happens from community resources like blogs, webcasts, etc.

Will they continue to be created at the same pace if the authors know they'll be irrelevant or wrong within a very short time? How will end-users learn to do things when none of the online examples they find still work?

I wish I knew the answer to this.

SQL: Design – Entity Attribute Value Tables (Part 2) – Pros and Cons

In an earlier post, I discussed the design of EAV (Entity Attribute Value) tables, and looked at why they get used. I'd like to spend a few moments now looking at the pros and cons of these designs.

Let's use the same table as last time as an example:
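
(That post showed the table in a screenshot. As a rough sketch, with assumed names, it has the familiar shape:)

-- Rough sketch of the EAV shape being discussed (names assumed,
-- not the exact table from the earlier post)
CREATE TABLE dbo.CustomerAttributes
(
    CustomerID int NOT NULL,
    AttributeName sysname NOT NULL,      -- e.g. N'HairColor', N'LoyaltyGrade'
    AttributeValue nvarchar(max) NULL,   -- everything stored as a string
    PRIMARY KEY (CustomerID, AttributeName)
);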

Pros

The main positive that's typically described is that the schema is "flexible". By this, the developers usually mean "I don't have to change the database schema (or worse, have someone else change it) when my needs change".

I don't buy this argument. Databases do a great job of managing metadata. Why would your app do it any better? I'll bet there's another reason (perhaps political) as to why this argument is being put forward in most cases.

Another argument is that this type of design handles sparse data gracefully. When we designed the WideWorldImporters database for Microsoft, we had a few edible products. The vast majority of the products were not edible. So the question is about where a use-by-date for an edible item should be stored. Should every product have that column?

Sparse columns in SQL Server deal with that quite well, but in SQL Server 2016, we also wanted to showcase some of the new JSON-based functionality, so we ended up putting this info into a JSON-based column. I would have been just as happy with a sparse column for this though. At least that would have allowed us good control over the data type.

Cons

I see many problems with these tables.

The most obvious problem is that a large number of joins is required to get all the related data for an entity, if you ever need to process it within the database. Many of these applications, though, just read and write the entire entity each time and aren't concerned about this. Those applications tend to be at the small end of the market, where performance matters less.
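
Continuing the sketch from above, reassembling even a single entity means one join back to the table per attribute:

-- One extra join per attribute, just to rebuild a single row per customer
SELECT c.CustomerID,
       hair.AttributeValue AS HairColor,
       grade.AttributeValue AS LoyaltyGrade,
       dob.AttributeValue AS DateOfBirth
FROM (SELECT DISTINCT CustomerID FROM dbo.CustomerAttributes) AS c
LEFT JOIN dbo.CustomerAttributes AS hair
    ON hair.CustomerID = c.CustomerID AND hair.AttributeName = N'HairColor'
LEFT JOIN dbo.CustomerAttributes AS grade
    ON grade.CustomerID = c.CustomerID AND grade.AttributeName = N'LoyaltyGrade'
LEFT JOIN dbo.CustomerAttributes AS dob
    ON dob.CustomerID = c.CustomerID AND dob.AttributeName = N'DateOfBirth';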

A second key issue is data typing. In the example above, what data type is the Value column? It's most likely some sort of string. If you're one of the cool kids, it might be a sql_variant instead. In this case, let's assume it's a string of some type. How can we then provide any sort of guarantees on what's stored in the column? The LoyaltyGrade is clearly an integer in this example, but there's nothing to stop us putting Gold in that field instead.
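
Continuing the sketch, there's nothing to stop this:

-- Succeeds, because the column is just a string
INSERT dbo.CustomerAttributes (CustomerID, AttributeName, AttributeValue)
VALUES (1, N'LoyaltyGrade', N'Gold');

-- And any code that assumed an integer then fails at run time
SELECT CAST(AttributeValue AS int) AS LoyaltyGrade
FROM dbo.CustomerAttributes
WHERE AttributeName = N'LoyaltyGrade';  -- conversion error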

Ah, but "the app does that" I'm told. That might be true but apps have bugs, as do ETL processes that put data in those fields. That's also very small-system thinking as most organizations will have a large number of applications that want to access this same data, and they won't even be built on the same technology stack, so lots of luck forcing all the access through your app or services layer. Ever tried to connect something like Excel to a service?

What I often see eventually is typed columns within the table:

  • StringValue
  • IntValue
  • DateValue
  • DecimalValue

and so on. Then we have to consider what it means if a value happens to be in more than one of these. (Ah yes, the app won't allow that).

Once again though, what you're doing is building a database within the database, just to avoid adding or dropping columns.

Yet another problem is the complete lack of referential integrity. You have no way (or at least no way that isn't ugly) to ensure that values lie in a particular domain. Is it OK for Brunette to be entered rather than Brown for the hair color? If not, how do you enforce that? (I know, I know, the app does that).

These are just a few of the issues, and there are many more.

Before I finish for today, I do want to mention yet another design that I saw recently. Instead of one table to rule them all with an EAV-based non-design, what this database contained was a table for each data type. Yes, there was a string table, an int table, a date table, and so on.

It should be obvious that such a design is even nastier.