Snowflake SQL for SQL Server Users – SQL Language Differences

I posted last time that I'd completed my Part 1 eBook on Core Concepts. The Part 2 eBook is on the differences in the SQL language between Snowflake and SQL Server.

This next series of blog posts will focus on the query language, particularly on DML (data manipulation language). DDL (data definition language i.e. CREATE, ALTER, DROP) differences will be covered in the Part 3 series on administration.

Snowflake has quite a rich SQL language but it's quite different to the T-SQL offered by SQL Server in many ways.  I'm planning to work right through the Snowflake SQL language documentation and try everything in it, so you don't have to.

In this series of posts though, I'll look primarily at what's different, not on what's the same. (Detailed coverage of the language itself will be in our upcoming online course on Snowflake SQL. It's being built now).

This post will serve as an index to the other posts in this series.

You'll find the other posts in the list below:

Part 1: Sequences: The good and the not so good

 

 

 

SQL: Find nearest enabled parent in a hierarchy using T-SQL

I was recently answering a forum question about working with a hierarchy. The poster had a table of nodes but importantly, nodes could be disabled. If that happened, the children of the node would move up to their nearest parent that wasn't disabled.

That is, if their parent was disabled and their grandparent was also disabled, then their great-grandparent would become their effective parent.

I provided the following code example, and I hope it helps someone.

The approach I took was to use a recursive CTE to traverse the nodes. As I passed through each node level, I'd add to a NodePath only if the node wasn't disabled. That way, I built up a NodePath of just the active nodes leading to the particular node.

To make it easier to then extract the parent node from the path, I added a function to do that. It reversed the string to make it easy to start at the end and find the nodes. It discarded the first node (the current node), then found the next node, reversed the string again, and returned it as an int.

You should be able to copy/paste the whole code and try it. I hope it helps someone or gives you some ideas.

BI: Wondering where you Integration Services Connector for Power Query has gone in Visual Studio 2019?

I wrote recently about menu items being missing from SSIS and other BI projects in VS2019, but another thing that's gone MIA is the Integration Services Connector for Power Query.

This is the connector that allows you to use Power Query (and the M language) to input data into SSIS.

The SSIS designer for VS2019 recently went GA (general availability). It's great that it's released, but if you install it, you'll find it removes the Power Query option for SSIS.

I imagine the reason for that is that the connector is still in preview, where the rest of the designer is now GA.

To get the functionality back, you need to separately download the Integration Services Connector for Power Query. You'll find both 2017 and 2019 versions of it here:

https://www.microsoft.com/en-us/download/details.aspx?id=100619

 

BI: Wondering why your SSRS, SSAS, and SSIS menus are missing in Visual Studio 2019?

If you've recently installed Visual Studio 2019 (VS2019), and then installed the SQL Server Reporting Services (SSRS), the SQL Server Analysis Services (SSAS), and/or SQL Server Integration Services (SSIS) extensions, when you open a project you might be wondering where your menus went.

How it was

For example, if you had a report open in the VS2017 report designer, this menu was present:

And if you had an AS model open in the VS2017 tabular designer, these menus were present:

And similarly, if you had an SSIS package open in the VS2017 designer, these menus were present:

 

How it now is

Now, when you open VS2019, no matter which of these you have open, you won't find those menus.

So where are they?

They're all hidden within the "EXTENSIONS" menu as these designers are all "Extensions". And if there were multiple menus, they're all in there separately:

Verdict?

I think this is a lousy idea for multiple reasons.

First, it means that when you're working on an object like a tabular data model, your menu bar is full of things unrelated to what you're working on, and the relevant menu items are hidden multiple levels down under "Extensions".

While I understand they're probably trying to keep the VS menu bar consistent, many people working with these projects could not care less about the rest of VS. That's just where these designers happen to be.

For the life of me, I can't see how that's good UI design.

Second, it makes the decades of blog posts and articles that have been written about these tools incorrect, and much harder to follow. The pain from this is hard to describe, and it's felt most by newcomers who will just be puzzled and not know what's going on.

I'd love to hear your thoughts. If you agree with mine, please tell Microsoft as well. I really hope they reconsider this.

 

 

SDU Tools: Script User Defined Database Role Permissions in SQL Server

As part of our free SDU Tools for developers and DBAs, we have many scripting functions. SQL Server allows you to create your own database roles, and generally, you should do this instead of using the fixed database roles like db_owner, or db_datareader and db_datawriter, as it lets you allocate just the required permissions. To allow scripting out the permissions assigned to these roles, we've added a tool called ScriptUserDefinedDatabaseRolePermissions.

It's a stored procedure (as it has to change databases) and takes a single parameter: @DatabaseName (sysname) as the name of the database whose roles you want to script.

You can use our tools as a set or as a great example of how to write functions like these.

Find out more

You can see it in action in the main image above, and in the video here:

Access to SDU Tools is one of the benefits of being an SDU Insider, along with access to our other free tools and eBooks. Please just visit here for more info:

http://sdutools.sqldownunder.com

 

T-SQL 101: #50 Using CHARINDEX and PATINDEX to find one string in another in SQL Server

It's useful to be able to find one string within another string. T-SQL has two basic functions for that: CHARINDEX and PATINDEX.

CHARINDEX is used to see if one string appears within another string. Take a look at this code:

In this first example, I asked SQL Server to look for the string lot in the string Hellothere. You can see that it found it at position 4.

But the second string lod wasn't found, so it returned position 0 to indicate that it could not find it.

Note that there is an option additional parameter that tells you which character position to start looking at.

PATINDEX

PATINDEX is a bit more complex:

PATINDEX is basically "pattern index". We define a pattern that we're looking for and it returns the position of the first place where it occurs. The patterns are the ones we saw in the LIKE operator previously.

% is a wildcard that matches anything or nothing, ^ says "not" and anthign in square brackets is a list of values, and you can also include a range like I did in this example, where I asked for any character between 0 and 4.

The first digit was 5 but that's not between 0 and 4, so it returns position 6 which is there the 4 character is located within the string.

Learning T-SQL

It's worth your while becoming proficient in SQL. If you'd like to learn a lot about T-SQL in a hurry, our Writing T-SQL Queries for SQL Server course is online, on-demand, and low cost.

Snowflake for SQL Server Users – A Christmas Present

I've had a lot of great feedback on my series of blog posts about Snowflake, and explaining the concepts for an existing SQL Server audience.

Just out for Christmas is a new free eBook for our SDU Insiders. I'm preparing a three part series on Snowflake for SQL Server users:

  • Part 1: Core Concepts (available now)
  • Part 2: SQL Language (coming soon)
  • Part 3: Administration (sometime after that :-))

You can get part one now! And did I mention free? (for SDU Insiders)

You'll find it here:

http://snowflakeforsqlserver.sqldownunder.com

Enjoy !

SQL: Violation of SQL Server UNIQUE KEY constraint but entry doesn't already exist

One issue that comes up time and again in the forums is when an INSERT statement fails with a violation of a UNIQUE or PRIMARY KEY constraint but when the user checks the existing table, the value that's being complained about isn't already in the table.

The Symptom

For example, a table might have primary key values of 2, 12, and 14.

When an INSERT is performed, there is an error telling you that you can't insert a duplicate key value of say 15.

You check the table and there is no row with a primary key value of 15.

The same issue can happen with UPDATE statements.

What Usually Causes This

Whenever you see this, the issue is almost always that the duplicate values exist in the data that you're trying to insert, and that's why it can't insert.

In the example above, it would be because the input rows contain two or more rows where the primary key value is 15.

Often this will be caused by a join that is producing more rows than planned.

 

 

T-SQL 101: #49 Using REPLICATE to repeat strings in SQL Server

REPLICATE is an easy function to work with. It takes a string and a number of times you want it repeated. Here's an example:

In this case, I've asked it to replicate (i.e. duplicate many times) an A character. I've asked for 5 of them. I put an X at each end so you could see the effect more clearly.

There are many things this can be used for, but for example, if I wanted to draw a line with 100 dashes, I could just ask for REPLICATE('-', 100).

Extra-long Strings

One challenge with this function though, is that the output data type is the same as the input data type. So even though I can have strings in T-SQL of up to 2GB in size, varchar was limited to 8000 characters. This means that if I wrote

I would only get 8000 characters.

But if I forced it to varchar(max), like this:

then I would get 10,000 A's.

Learning T-SQL

It's worth your while becoming proficient in SQL. If you'd like to learn a lot about T-SQL in a hurry, our Writing T-SQL Queries for SQL Server course is online, on-demand, and low cost.

Snowflake for SQL Server Users – Part 20 – Encryption within Snowflake

As with most products today, Snowflake has substantial handling of encryption to protect client data.

All editions claim to provide "Always-on enterprise grade encryption in transit and at rest".

In Transit

First, for connections, HTTPS and TLS 1.2 are used throughout the system.

If the customer is using external stages (holding locations in cloud-storage accounts), the data can be encrypted while stored in the stage, and then travel encrypted into Snowflake. To do that, you create an "encrypted stage" where you tell Snowflake the master encryption key (i.e. the client side key) when creating the stage:

This is useful in allowing you to create an end-to-end encryption while ingesting data into Snowflake.

At Rest

Regarding "at rest", I've previously described how micropartitions are used to store data in data files. These data files are then encrypted (using AES-256) before being stored.  But how it's handled changes with different editions of Snowflake.

There are four levels of keys used:

  • The root key
  • Account master keys
  • Table master keys
  • File keys

With Standard and Premier editions, a different file key is used each month (aka key rotation). So a single "active" key is used to encrypt your first month's data, then a different key is used to encrypt the next month's data, and so on. Previous keys that are no longer "active" (aka retired) are only used for decryption.

Annual Rekey

Enterprise edition and above offer "Annual rekey of all encrypted data". I was a bit puzzled about how that worked at first. I imagined that once a year, all the existing data would get decrypted and rekeyed. I was thinking that would be quite an expensive operation on a large data warehouse.

What happens instead, is that when any key has been retired for a year, any data encrypted by it is decrypted and re-encrypted using a new key. So if you have those editions, if you store data this month, it will have a single key. In a year and one month's time, it will get rekeyed. And again in two years and one month's time, and so on.

Business Critical

The Business Critical edition takes things further. As well as providing  HIPAA and PCI compliance, it offers "Tri-Secret Secure using customer-managed keys". In this case, Snowflake combines the key that you provide with a Snowflake-maintained key to create a composite master key.

As well as having the obvious benefits in letting you manage access to the data more closely, you can immediately remove all access if needed (i.e. closing an account, or recovering from a breach).

For an index to all posts in this series, see the first post here.