SQL Interview: #12: Using UNION vs UNION ALL

This is a post in the SQL Interview series. These aren't trick or gotcha questions, they're just questions designed to scope out a candidate's knowledge around SQL Server and Azure SQL Database.

Section: Development
Level: Intro

Question:

Consider the following code:

A: What would be the difference in results if the query was changed to use a UNION ALL instead of the UNION?

B: What are the performance differences between using a UNION ALL and a UNION in a SELECT statement?

C: How could this query be rewritten without the UNION?

Answer:

A: If the query was changed to use UNION ALL, a customer might be returned more than once. A UNION performs a DISTINCT operation on the results of the query. A UNION ALL does not perform the DISTINCT operation on the results.

B: A UNION invariably involves more work and is slower as it needs to perform a DISTINCT operation on the results.

C: One option would be to rewrite it as follows:

 

SQL Interview: #11 Adding a column in the middle of a table's columns

This is a post in the SQL Interview series. These aren't trick or gotcha questions, they're just questions designed to scope out a candidate's knowledge around SQL Server and Azure SQL Database.

Section: Development
Level: Medium

Question:

When you add a column to an existing SQL Server table, the column is added to the end of the table. The column will end up with the highest column_id.

If a developer asks you how a column can be added to the middle of a table instead, what is your advice and how would you approach the request?

Answer:

The first part of the advice is that ideally you won't care what order the columns are in your table. However, I completely understand that a developer might like to see columns grouped appropriately in a list of columns for a table, rather than just a random list of columns.

Unfortunately, SQL Server does not have an option like MySQL (for example), where you can add "AFTER", "FIRST", or "LAST" when defining columns.

With SQL Server, the general approach is to:

  • Copy the data out to a temporary table
  • Drop any foreign key constraints that reference the table
  • Drop and recreate the table with the desired order
  • Copy the data back in from the temporary table
  • Recreate any foreign key constraints that reference the table
  • Drop the temporary table

 

 

SQL: Try the new Cascadia Code as a font in SSMS

The good news is that Visual Studio 2022 has been announced, and if you haven't read the announcements, the big deal is that it's finally a 64 bit tool. You can still build 32 bit apps with it, but the tool is now 64 bit. I hope SQL Server Management Server (SSMS) will also follow it to 64 bit but that's not likely to be known as yet.

However, one interesting part of the announcement is a new font designed to be used for development. It's called Cascadia Code. And even though Visual Studio 2022 isn't available yet, the font is. I've been trying it in SSMS and quite like it. Previously I was using Consolas.

Installing Cascadia Code

It's easy to install. Here are the steps:

  • Download the latest release from here: https://github.com/microsoft/cascadia-code/releases
  • Unzip the downloaded file and from the ttf folder (presuming you're running Windows), copy the files CascadiaCode.ttf, CascadiaCodePL.ttf, CascadiaMono.ttf, and CascadiaMonoPL.ttf.
  • Right click each one and click Install for all users.

Using Cascadia Code in SSMS

Using the fonts in SSMS is easy. From the Tools menu, click Options.

From the Environment section, click Fonts and Colors.

Now the fonts will be available in the drop-down list for Font.

You'll then see a list similar to what's in the main image above. Note that fonts that are shown bold are fonts with fixed spacing (i.e. not proportional fonts), and that's usually what I want when doing development.

 

 

T-SQL 101: #95 Choosing options with CASE in T-SQL

You often need to choose from many outcomes, based upon a specific value. You might want to say if the value is 3 then choose Red, but if the value is 4 then choose Blue, and if the value is 5, then choose Green, etc. The way that you apply this type of logic in T-SQL is by using the CASE statement.

There are several ways you can write CASE statements. In the main image above, I have said that if the value is greater than or equal to 25, then say "Large". If the value was between 20 and 24, then "Medium". Otherwise (for all other values), choose "Small".

The output is as follows:

It's important to understand that the CASE statement checks each condition, before moving to the next condition, and the conditions do not have to be based on the same value. For example, I could say:

The CASE statement stops checking once a condition has been satisfied. In the example above, even if the Size is 370ml, that will not be checked if the OuterQuantity was at or above 25.

Alternate Structure

Another format for CASE can be used when a single value is being checked:

That only works for a simple list of values.

ELSE

You will notice in the example above that an ELSE clause was provided. This says "if none of the conditions above applies, then choose this".

The ELSE clause is optional. If you do not have an ELSE clause, and none of the conditions apply, the return value will be NULL.

SQL Clauses

The most common use of CASE statements is in SELECT clauses but it's important to understand that it can be used in many other places in queries. WHERE clauses commonly use CASE, even ORDER BY can use CASE.

Learning T-SQL

It's worth your while becoming proficient in SQL. If you'd like to learn a lot about T-SQL in a hurry, our Writing T-SQL Queries for SQL Server course is online, on-demand, and low cost.

SQL Interview: #10 System-defined default names

This is a post in the SQL Interview series. These aren't trick or gotcha questions, they're just questions designed to scope out a candidate's knowledge around SQL Server and Azure SQL Database.

Section: Development
Level: Medium

Question:

When you define a column default with code like below:

the system will define the name of the default. Can you give examples of why specifying the name of the default instead of letting the system supply it would be a good practice?

Answer:

There are several reasons. Here are three:

If you ever need to change the default value, you will need to know the name of the default constraint to be able to remove it, before you add a new default. This is much easier if you already know the name of the default.

If you ever need to drop the column, in SQL Server, you must first drop any default on the column. Again, that is much easier if you already know the name of the default. (Note that other database engines like PostgreSQL do not allow you to name defaults, but they also automatically drop them when dropping columns).

If you have created the table in multiple databases and you are using database comparison tools to check for differences, having consistent names in the script avoids the potential detection of a difference. (Note that some comparison tools can ignore system names like these).

 

SDU Tools version 21 is now released for download

Version 21 of our free SDU Tools for developers and DBAs is now released and winging their way out to our SDU Insiders.

You can find details on the tools here.

If you haven't been using SDU Tools yet, I'd suggest downloading them and taking a look. At the very least, it can help when you're trying to work out how to code something in T-SQL.

Along with the normal updates to SQL Server versions and builds, we've added the following new functions:

CreateAnalyticView – this new tool helps to support automating the scripting of analytic views based upon existing data warehouse tables. It will be particularly of use to anyone following our methods in the Implementing Power BI in the Enterprise workshops or upcoming book.

WeekdayAcrossYears – this one was requested by Dave Dustin. It's a new function that returns the day name for a given day and month number for a range of years. (For example, which day of the week will my birthday be for the next ten years?)

SQLServerType – this new function returns the type of server that you are attached to. For example, it can help you to know if you are connected to a SQL Server database or an Azure SQL Database or an Azure SQL Managed Instance.

The following tools have been upgraded:

SplitDelimitedString – this tool now uses a much faster method. It works the same, just much faster.

ListUseOfDeprecatedDataTypes – this procedure has been updated and provides an additional output column that gives you an example of the T-SQL script that you might need to execute to correct the inappropriate data type. (Thanks to Michael Miller for the suggestion)

GREATER and LEAST have been added to the list of reserved words.

The list of exceptions in the ProperCase function has been expanded. (Thanks to John Reitter for the suggestions)

And of course, there are a few corrections included as well.

StartOfFinancialYear – we fixed a scenario where the wrong year could be returned.

ChineseYears – we fixed a typo in the year of the goat (Thanks again to Dave Dustin)

UnixTimeToDateTime2 and DateTime2ToUnixTime – now use bigint values to ensure they'll avoid the Unix 2038 problem. (Thanks to Bob Roberts for this one)

Scripting functions that output column lengths for nchar and nvarchar were corrected.

Check out all the available functions, procedures, and views here.

SQL Interview: #9: Computed columns in table definitions

This is a post in the SQL Interview series. These aren't trick or gotcha questions, they're just questions designed to scope out a candidate's knowledge around SQL Server and Azure SQL Database.

Section: Development
Level: Medium

Question:

Consider the following code:

Will the CreatedDate column return the same value in both SELECT statements?

Answer:

When you define a computed column, the value is calculated when it is SELECTed. In this case, the two SELECT operations will occur at different times, and different values will be returned for that column.

It is possible with some computed columns to add the term PERSISTED. In that case, the value is calculated at INSERT or UPDATE but is then stored. The same value would be returned every time it is SELECTed.

However, in this case, you cannot apply PERSISTED to the computed column, as the expression SYSDATETIME() is not deterministic.

 

T-SQL 101: #94 Returning messages to the client by using PRINT and RAISERROR in T-SQL

There are times when you're writing in any programming language, that you want to output messages back to the client. In T-SQL, the statement provided for that is the PRINT statement.

PRINT is simple enough to use. You give it a  string to return and it prints that string.

It's different to the SELECT statement, because the SELECT statement returns a rowset (i.e. a set of rows).

In the main image above, you can see a script with both a PRINT and a SELECT. When that script is executed, two things happen. The output of the SELECT statement is returned in the Results tab in SQL Server Management Studio as below:

But notice that the PRINT output isn't there. Instead, it's returned in the Messages tab as below:

Using RAISERROR instead of PRINT

PRINT is actually a specific instance of the RAISERROR statement. Instead of using PRINT, you could use RAISERROR like this:

Notice that the output is identical. The first parameter to RAISERROR is the message, the second is the severity, and the third is the state. In this case, any severity from 1 to 10 would have worked. You can just make the state be 1 and ignore it until we discuss RAISERROR in much more detail later.

RAISERROR and NOWAIT

So why would you ever use RAISERROR instead of PRINT? After all, PRINT is simpler.

The answer is related to when the messages are returned. For example, if you put PRINT statements in a stored procedure, it's only after the stored procedure completes that you'll see the output.

If however, you use the NOWAIT option in RAISERROR, the messages come back immediately.

This can be very useful when you are debugging code.

Learning T-SQL

It's worth your while becoming proficient in SQL. If you'd like to learn a lot about T-SQL in a hurry, our Writing T-SQL Queries for SQL Server course is online, on-demand, and low cost.

SQL Interview: #8: Why should you avoid the use of db_datareader and db_datawriter?

This is a post in the SQL Interview series. These aren't trick or gotcha questions, they're just questions designed to scope out a candidate's knowledge around SQL Server and Azure SQL Database.

Section: Security
Level: Intro

Question:

Many applications connect to SQL Server databases with users that are members of the db_datareader and/or db_datawriter roles?

Why should you avoid using those roles?

Answer:

db_datareader and db_datawriter are fixed database roles that were provided for convenience. They aren't a good option from a security perspective.

Whenever you add a user (or group) to any fixed role, you are assigning them a collection of permissions. By definition, that list of permissions will need to include at least all the permissions they require.

However, invariably when you use fixed roles, you are also assigning additional permissions or access to objects that aren't required.

A more secure solution is to create a role and grant it only the permissions that are required, and then assign the user (or group) to that new role.

 

SQL Interview #7: Are statement terminators ever required in T-SQL?

This is a post in the SQL Interview series. These aren't trick or gotcha questions, they're just questions designed to scope out a candidate's knowledge around SQL Server and Azure SQL Database.

Section: Development
Level: Intro

Question:

Many versions of SQL require every SQL statement to be terminated with a semicolon.

Since it was created, T-SQL has listed statement terminators as optional.

Are there any situations where they are required, and not optional?

Answer:

Since SQL Server 2005, some T-SQL statements must be separated from previous statements by a statement terminator.

The statements involved are WITH, SEND, RECEIVE.

In addition, a MERGE statement must have a statement terminator.

Using statement terminators is generally considered good practice, even though many of the Microsoft-supplied tools generate scripts without them.

Ever since SQL Server 2005, the product deprecation list has stated that the optional nature of statement terminators is deprecated. For this reason alone, you should use statement terminators.

But there are other potential issues. Consider this nasty example from fellow MVP Erland Sommarskog:

What is not immediately obvious is that THROW would never be executed. Instead, it would be interpreted as a column alias for ERROR_MESSAGE().

Using semicolons as statement terminators avoids this type of issue.