Mixing UNION and UNION ALL and other oddities seen in consulting

I always say that one of the things that I love about consulting or mentoring work is that I see things (mostly code) that I would have never have thought of, both good and bad. You could spend all day dreaming up bizarre ideas and never come close to the ones that I just happen to come across.

A good example of this was a site I was at a while back where every table had a GUID name. Yes, I’m talking about tables named dbo.[3B38AB7E-FB80-4E56-9E5A-6ECED7A8FA17] and so on. They had tens of thousands of tables named this way. Query plans were close to unreadable.

Another was a site where the databases were named by their location on the disk. Yes, they had a database named X:\Database Files\Data Files\CoreDB.mdf. And yes that does mean that you end up using it like:

image

Not all odd things that I see are so blatant though. Today I saw a more subtle coding issue. With SQL Server the UNION operation combines to rowsets into a single rowset. If UNION ALL is used then all rows are returned. With just UNION without the ALL, only distinct rows are returned. All good so far.

But until today, I’d never stopped to think about what happens when you mix the two operations. For example, without running the code (or reading further ahead yet), what would you expect the output of the following command to be? (The real code read from a table but I’ve mocked it up with a VALUES clause to make it easier to see the outcome).

image

I was left wondering if there was some type of operation precedence between UNION and UNION ALL. The output rows are:

clip_image001

It isn’t a case of precedence. The operations are just being applied in order. You can see this as follows:

Executing the first part:

 

image

returns the following with no surprises:

clip_image002

Executing the first two parts:

image

returns all rows from both queries:

clip_image003

Executing the first three parts:

image

returns the following rows. This is formed by taking the previous result, executing the third query then performing a distinct operation on the whole combined rowset.

clip_image004

Executing the entire query then takes this previous result set and appends (based on the UNION ALL), the results from the fourth part of the query.

clip_image005

Regardless of how it actually works, I think it’s important to avoid writing code where the outcome is less than obvious. In this case, it was just a bug but if the code as written was the intended code, adding some parentheses to this query might have made the intent clearer.

And of course in this case, a sprinkle of extra DISTINCT and GROUP BY operations made it a thing of beauty:

image

which actually returned:

clip_image006

So what they really should have written in the first place was:

image

<sigh>

Data Camp Sydney (Free)

Hi Folks, on the 5th June (yes that’s Friday next week), I’m running a Data Camp day for the local Microsoft team. It’s being held at Cliftons in the city.

We’ve got four topics for  the day:

  • Azure SQL DB
  • Azure DocumentDB
  • Azure Machine Learning
  • Azure Stream Analytics

If you want to get your head around any/all of these, we’d love to see you there. Places are limited but you must register and can do so here: https://msevents.microsoft.com/CUI/EventDetail.aspx?EventID=1032627085&Culture=en-AU&community=0

Latest MVA Offerings for SQL Server and for Windows 10 (Yes it’s time to start looking at this)

The team at Microsoft Virtual Academy (MVA) have pushed out some new content that’s relevant to database people.

First, if you’re wondering about using Azure for SQL Server, the Jumpstart for SQL Server in Azure VMs is worth a look. Longer term, I suspect we’ll mostly end up using SQL Server as a platform service (Azure SQL DB) but in the short-term, implementing it in a VM will be more common as it’s probably both easier when migrating existing applications and a little more familiar to most.

Next, if you have to deal with other databases (shock horror, yes there are others including open source ones), there is a course on Open Source Databases on Azure.

Finally, you would have to have been living under a rock not to notice that Windows 10 is coming. But now, it’s time to start to get your head around what’s different. There’s a course that covers off the Fundamentals of the Technical Preview of Windows 10.