T-SQL 101: #32 Repeating T-SQL batches with GO n

In my last T-SQL 101 post, I mentioned that GO was just a word that's used to separate batches. In fact, it's not a SQL word at all. If you actually send the word GO to the server, it wouldn't know what you're talking about.

I mentioned that the word is only understood by the program that you type the script into. SSMS also allows us to make use of another additional trick in relation to GO.

I can add a number after the word GO, and SSMS will execute the batch that number of times.

Here's an example:

First up, notice that the 10 is underlined with a red squiggle. The T-SQL intellisense has no idea what's going on with the number.

But when we execute it, SSMS outputs Beginning execution loop, then it sends the batch containing PRINT 'HELLO'; to the server 10 times (one at a time), then prints Batch execution completed 10 times.

SSMS is doing all the work. SQL Server just sees 10 separate batches sent to it.

This can be useful if you need to run a command many times.

Learning T-SQL

It's worth your while becoming proficient in SQL. If you'd like to learn a lot about T-SQL in a hurry, our Writing T-SQL Queries for SQL Server course is online, on-demand, and low cost.

Learning Mandarin: Five – about me?

This is the sixth in a series of posts where I'm looking at how the Chinese view numbers, often from a superstitious basis.

Five

The Chinese character for five (Wǔ) is shown on the right hand side of the main image above.

The number five is slightly on the lucky side of neutral in Chinese, where people believe there are five blessings: luck, prosperity, wealth, longevity, and happiness.

Similarly, it has been associated with the five elements: water, earth, fire, wood, and metal. There is some past association with emperors.

It sounds a bit like the word 吾 (Wú) which means "I", "my", or "me".

While that's all good, it also sounds like 无 (Wú) which is more like "nothing" or "not" or "without". That has the possibility of being either good or bad.

Learning Mandarin

I'll write more soon on the best methods for learning. If you want to get a taste for it in the meantime though, my current favorite site is iTalki, and my favorite teacher by far is Amy He. If you decide to try it, click here and it's cheaper for both you and me.

SDU_FileSplit – Free utility for splitting CSV and other text files in Windows

When I was doing some Snowflake training recently, one of the students in the class asked what utility they should use on Windows for splitting a large file into sections. They wanted to split files for better bulk loading performance, to be able to use all available threads.

On Linux systems, the split command works fine but the best that most people came up with on Windows was to use Powershell. That's a fine answer for some people, but not for everyone.

Because the answers were limited, and people struggled to find a CSV splitter, I decided to fix that, and create a simple utility that's targeted as exactly this use case.

SDU_SplitFile

SDU_SplitFile is a brand new command line utility that you can use to split text files (including delimited files).

Usage is as follows:

SDU_FileSplit.exe <InputFilePath> <MaximumLinesPerFile> <HeaderLinesToRepeat> <OutputFolder> <Overwrite> [<OutputFilenameBase>]

The required parameters are positional and are as follows:

<InputFilePath> is the full path to the input file including file extension
<MaximumLinesPerFile> is the maximum number of lines in the output file
<HeaderLinesToRepeat> is the number of header lines to repeat in each file (0 for none, 1 to 10 allowed)
<OutputFolder> is the output folder for the files (it needs to already exist)
<Overwrite> is Y or N to indicate if existing output files should be overwritten

There is one additional optional parameter:

<OutputFilenameBase> is the beginning of the output file name – default is the same as the input file name

Example Usage

Let's take a look at an example. I have a file called Cinemas.csv in the C:\Temp folder. It contains some details of just over 2000 cinemas:

I'll then execute the following command:

This says to split the file Cinemas.csv that's currently in the current folder with a maximum of 200 rows per file.

As you can see in the previous image, the CSV has a single header row. I've chosen to copy that into each output file. That way, we're not just splitting the data rows, we can have a header in each output file.

We've then provided the output folder, and also said Y for overwriting the output files if they already exist.

And in the blink of an eye, we have the required output files, and as a bonus, they're all already UTF-8 encoded:

Downloading SDU_FileSplit

It's easy to get the tool and start using it. You can download a zip file containing it here.

Just download and unzip it. As long as you have .NET Framework 2.0 or later (that's pretty much every Windows system), you should have all the required prerequisites.

I hope you find it useful.

Disclaimer

It's 100% free for you to download and use. I think it's pretty good but as with most free things, I can't guarantee that. You make your own decisions if it's suitable for you.

 

Snowflake for SQL Server users – Part 3 – Core Architecture

The first thing to understand about Snowflake is that it has a very layered approach. And the layers are quite independent, including how they scale.

Cloud Provider Services

The lowest level isn't part of Snowflake; it's the services that are provided by the underlying cloud provider. As a cloud native application, Snowflake is designed to use services from the cloud provider that they are deployed to, rather than providing all the services themselves. At present, that means AWS or Microsoft Azure. Deployment on Google's cloud platform is in preview at this time.

Each deployment of Snowflake includes the upper three layers that I've shown in the main image above.

Storage Layer

This layer is exactly what it says. It uses storage from the cloud provider to provide a way to store anything that needs to be persisted. That includes, the obvious things like databases, schemas, tables, etc. but it also includes less obvious things like caches for the results of queries that have been executed.

Storage is priced separately from compute. While the prices differ across cloud providers and locations, the Snowflake people said they aren't aiming to make a margin much on the storage costs. They're pretty much passing on the cloud provider's cost.

In addition, when you're staging data (that I'll discuss later), you can choose to use Snowflake managed storage or your own storage accounts with a cloud provider to hold that staged data.

Compute Layer

This is the heart of where Snowflake make their income. The compute layer is made up of a series of virtual clusters that provide the compute power to work on the data. Importantly, each of the virtual clusters (called Virtual Warehouses) can independently access the shared underlying storage.

Compute is charged by consuming what are called "credits". I'll write more about those soon. What is interesting is that you only pay for compute while a virtual warehouse is running.

While there are some blocking commands, the idea is that you should do most of those in staging areas, to keep your shared storage accessible in a highly concurrent way. The aim is to have a virtual warehouse happily querying the data, while another virtual warehouse is in the middle of bulk loading other data. And the virtual warehouses can each be different sizes.

Global Services Layer

This is the layer where all the metadata and control lives. It's also where transactions are managed, and where security and data sharing details live. I'll describe each of its functions in future posts.

 

SDU Tools: Create SQL Server Login with SID from a database

In SQL Server, both logins (access to the server) and users (access to a database) have a name and a security ID (SID). This leads to problem situations where names might match but SIDs don't match.

Common Problem

I've lost count of the number of times I've seen a user restore a database from another server, and then realize that the SQL Server login they need wasn't present. Then, they create a new login and end up in a lousy situation, because the new login's SID doesn't match the SID of the user in the database.  I call this a mismatched SID issue.

The error message will tell you that the user doesn't exist in the database, but when you try to add the user, it will tell you it already exists. That's the sort of thing that makes people want to throw their mouse through their screen.

Microsoft Solution and why it's not enough

There is a Microsoft solution for fixing this problem.  For a long time, the command that people used was:

sp_change_users_login

Lot's of people still use that command. If you're one of the cooler kids, you'll know that the current way to fix a user is to instead do this:

ALTER USER Blah WITH LOGIN Blah;

It's described here.

What these commands do though, is change the database SID to match the one for the login. I think they're fixing the wrong problem.

The problem with this solution though, is that most of these restores aren't one-offs. And when the next restore is done, you'll be back in the same problem.

Avoiding the Issue

Now there is of course a way to have avoided this issue in the first place. Instead of just creating a new login and getting a new SID, you could have specified the SID when creating the login. Then it would match and there wouldn't be an issue.

You could have retrieved the SID by scripting it out when you scripted the login (if you scripted it – most people with this problem haven't).

Another solution

Another solution would be to retrieve the SID from the database that you can't access and use it for creating the new SQL login.

In our free SDU Tools for developers and DBAs, we have a tool called CreateSQLLoginWithSIDFromDB that does just that. It's a long name but it does precisely what it says.

It is quite configurable and has a few parameters (many of them optional):

@SQLLoginName sysname -> Name of the login to create
@Password nvarchar(128) -> Password to assign
@SIDDatabaseName sysname -> Database to retrieve the SID from
@DefaultDatabase sysname -> (Optional) default database for the login
@DefaultLanguage sysname -> (Optional) default language for the login
@IsCheckExpiration bit -> (Optional default 1) is expiration to be checked?
@IsCheckPolicy bit -> (Optional default 1) is policy checked?
@SIDDatabaseUserName sysname -> (Optional default @SQLLoginName)

Find out more

You can see an example of calling it in the main image above, and see it in action in the video here:

Access to SDU Tools is one of the benefits of being an SDU Insider, along with access to our other free tools and eBooks. Please just visit here for more info:

http://sdutools.sqldownunder.com

 

Opinion: Please let me have a clean desktop

I might be a bit anal about this but I really dislike clutter all over my computer's screen desktop. You'll notice above that I keep it empty. If there's something on my desktop, it's something that I'm working on really briefly. If I need to do a presentation, I might have a single folder called Desktop that I'll sweep anything on the actual desktop into.

There are two things that drive me crazy in terms of desktop clutter though:

The first is applications that insist on putting icons or shortcuts on your desktop during installation. Hello, it's not 1981 any more.

And worse, if you delete the shortcuts,those applications often put them back every time they are auto-upgraded.

Application writers: Please don't do this.

The other issue is around IT departments. I spend a lot of time working at client sites. And on many of the corporate desktops that I'm supplied to work with, the IT department have both plastered shortcuts all over the desktop, and worse, made it so that I'm not able to delete them.

IT departments: Please don't do this.

T-SQL 101: #31 Understanding batches, scripts, and GO

The image below shows a T-SQL script.

It's an example of contents of a file that you could have on your disk so the entire thing could be called MyProgram.sql. Inside this script file, we've got a few different types of things happening here.

The word script applies to all of the contents of the file.

You can see the word GO there a number of times. It's a special word that's not actually a T-SQL keyword. It's a batch separator. It only has meaning to the tool you're using to execute queries, like SQL Server Management Studio or Azure Data Studio.

When you execute a script, it might look like it's all sent to the server at once, but that's not what happens. SSMS finds the word GO, breaks the script up into batches, and sends them to the server, individually.

So, in this case, the following four batches are sent, one batch at a time:

  1. USE master;
  2. CREATE DATABASE Interesting;
  3. USE Interesting;
  4. CREATE TABLE dbo.Information
    ( InformationID int,
    Thoughts varchar(20)
    );
    INSERT INTO dbo.Information
    VALUES (1, 'Hello');

We describe Batch 4 as having two statements, the CREATE TABLE statement, and the INSERT statement.

What is GO really?

I mentioned that GO is a batch separator. You could actually use almost anything instead of GO. In the Tools>Options>Query Execution>SQL Server section in SSMS, you could change it to something else.

I'd suggest that you avoid changing it because you want your scripts to run on other machines and programs as well, and GO is the standard batch separator.

I have a Star Trek friend. All his scripts have the word ENGAGE instead of GO.

And of course a nasty trick for new DBAs is to change it to the word SELECT and watch them try to work out what's wrong.

Learning T-SQL

It's worth your while becoming proficient in SQL. If you'd like to learn a lot about T-SQL in a hurry, our Writing T-SQL Queries for SQL Server course is online, on-demand, and low cost.

Learning Mandarin: Four – death…

This is the fifth in a series of posts where I'm looking at how the Chinese view numbers, often from a superstitious basis.

Four

The Chinese character for four (Sì) is shown on the right hand side of the main image above.

Four is universally regarded as a bad number in Chinese culture. Once again, that's because it sounds like other words.

In particular, four sounds like 死 (Sǐ) which means "death". It's similar in Cantonese where the word is sei, it's the same issue. It's considered an unlucky number because of this.

You'll find buildings in Asia that don't have a floor 4 (similar to how many Western buildings years ago had no floor 13). Some buildings take it even further and have no floor 4, 14, 24, 34, and so on.

Four is so frowned upon that people avoid saying it, and don't display it anywhere. People will get upset if they're issued an ID or credit card that has the number 4 in it.

Learning Mandarin

I'll write more soon on the best methods for learning. If you want to get a taste for it in the meantime though, my current favorite site is iTalki, and my favorite teacher by far is Amy He. If you decide to try it, click here and it's cheaper for both you and me.

Book Review: Power BI MVP Book

Over the last few months, one of my Kiwi buddies (and fellow member of both the MVP and Microsoft Regional Director programs) Reza Rad has been organizing a bunch of us to write a book that's a collection of ideas from a number of MVPs. It's the Power BI MVP Book.

There are a whole lot of authors from a whole lot of different countries: Reza Rad, Anil Maharjan, Indira Bandari, Liam Bastick, Ken Puls, Jesus Gil, Thomas LeBlanc, Ike Ellis, Matt Allington, Leila Etaati, Markus Ehrenmüller, Ashraf Ghonaim, Eduardo Castro, Manohar Punna, Treb Gatte, Gilbert Quevauvilliers, Michael Johnson, Shree Khanal, Asgeir Gunnarsson, Greg Low, Gogula Aryalingam.

I've done these types of books before with the SQL Server MVP Deep Dives pair of books. They are a different book in that you're not getting a single story throughout the book. Instead, you're getting a whole set of independent chapters on a variety of topics related to Power BI.

The general idea of these books is to support a charity, and that's where anything that I would have earned from them is going.

Bottom line?

I hope you find this book useful. There is both a paperback and a Kindle edition. The Kindle eBook is far cheaper.

Greg's rating: you decide

Note: as an Amazon Associate I earn (a pittance) from qualifying purchases but whether or not I recommend a book is unrelated to this. One day it might just help cover some of my site costs. (But given the rate, that's not really likely anyway).

Snowflake for SQL Server users – Part 2 – Cloud First Design

In recent years, I've done a lot of work in software houses (Microsoft calls them ISVs or Independent Software Vendors). Many of these software houses have worked out that they won't be able to just keep selling their on-premises applications because their customers are asking for cloud-based solutions.

And more importantly, the customers want the software houses to manage the applications rather than themselves. So, many of the software houses start trying to turn their on-premises applications into Software as a Service (SaaS) applications.

And the outcome? Is almost always really, really poor.

Cloud First

The real problem is that those applications haven't been written with a cloud-first mentality. They really are often the "square peg in a round hole". What often happens is that the cloud-based versions of the applications have large numbers of limitations compared to the on-premises versions of the same applications, because recreating everything in the cloud (when there were things that weren't designed for the cloud) is often nearly impossible.

I'm a big fan of Azure SQL Database, and they've done a great job on it (way better than almost any other application), but it's still quite a distance short of where we already were with SQL Server on-premises. I wish the marketing for the product would focus on how what is there (i.e. a truly amazing product) but the discussion always seems to be around what's missing, and how it compares to the on-premises product. In fact, I'm sure the entire reason that the Managed Instance versions of Azure SQL Database appeared were to address some of the shortcomings.

If the first time you'd seen SQL Server was to see Azure SQL Database, you'd come away saying how amazing it is. But if you've come from the on-premises product, chances are that you might be already using something that isn't there.

Nice to have a blank slate

Even if you were brand new to Azure SQL Database though, you'd find aspects of how the product is designed that are based on thinking from decades ago, and were designed for systems that were available back then. It's very hard to make major changes when you're bound to trying to keep backwards compatibility.

One key advantage that the team building Snowflake had was a clean slate where they could design a product that targeted cloud provider based services under the covers, instead of on-premises physical devices and operating systems.