Snowflake SQL for SQL Server Users - Sequences: The good and the not so good

I’ve been working to convert our SQL Down Under sample databases like PopkornKraze across to Snowflake. One of the areas that I ran into real challenges with was sequences.

The good news

The syntax for creating and accessing them is basic enough.

CREATE SEQUENCE "SequenceName";

is enough. (Note: the double quotes are mine, as I don’t want the name folded to uppercase as SEQUENCENAME.)

And there are some of the same options that SQL Server has:

START WITH = 1 (you can even leave out the equals sign)
INCREMENT BY = 1 (again the equals is optional)
COMMENT = 'some comment'

Snowflake syntax also supports IDENTITY(seed, increment) if you want to define that at a table level. That is like a single-table sequence, and it might help in migrating SQL Server code.
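Here’s a minimal sketch that puts those options together. The sequence, table, and column names are hypothetical ones I’ve made up for illustration; only the options themselves come from the list above.

```sql
-- Hypothetical names; the options are the ones listed above.
CREATE SEQUENCE "OrderNumbers"
    START WITH = 1
    INCREMENT BY = 1
    COMMENT = 'Order numbers for a sample table';

-- Table-level alternative using IDENTITY(seed, increment).
CREATE TABLE "Orders"
(
    "OrderID" INTEGER IDENTITY(1, 1),
    "CustomerName" STRING
);
```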

What I do really like are the options for replacing or only creating if not exists:

CREATE OR REPLACE SEQUENCE "SequenceName";

CREATE SEQUENCE IF NOT EXISTS "SequenceName";

We can’t do that in SQL Server today. (I’ve complained to the SQL Server team that these types of options don’t exist for sequences).

Getting the next value for a sequence is easy enough:

SELECT "SequenceName".NEXTVAL;

Like SQL Server, there’s no guarantee that you won’t have gaps in the numbers that are allocated.
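As a rough sketch of how NEXTVAL is typically used, here it is as a column default and directly in an INSERT. The "CustomerNumbers" sequence and "Customers" table are hypothetical names, and I’m not showing the values issued, since gaps are possible.

```sql
-- Hypothetical sequence and table names, for illustration only.
CREATE SEQUENCE "CustomerNumbers";

CREATE TABLE "Customers"
(
    "CustomerID" INTEGER DEFAULT "CustomerNumbers".NEXTVAL,
    "CustomerName" STRING
);

-- The default supplies the key when the column is omitted.
INSERT INTO "Customers" ("CustomerName")
VALUES ('First customer');

-- The sequence can also be referenced explicitly.
INSERT INTO "Customers" ("CustomerID", "CustomerName")
SELECT "CustomerNumbers".NEXTVAL, 'Second customer';
```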

There is also a table function called GETNEXTVAL() that takes the sequence name as a parameter. It’s a one-row table function that generates a unique value, and it’s used for precise control of sequence generation when many tables are involved.
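A minimal sketch of GETNEXTVAL() in a query, based on my understanding of the pattern in the Snowflake documentation. The "RowNumbers" sequence, the "Staging" table, and the aliases are hypothetical.

```sql
-- Hypothetical names; the shape follows the TABLE(GETNEXTVAL(...)) pattern.
CREATE SEQUENCE "RowNumbers";

CREATE TABLE "Staging" ("ProductName" STRING);

INSERT INTO "Staging" ("ProductName")
VALUES ('Popcorn'), ('Caramel corn');

-- Joining to the one-row table function yields a distinct value
-- from the sequence for each row of "Staging".
SELECT st."ProductName", s.NEXTVAL AS "ProductID"
FROM "Staging" st, TABLE(GETNEXTVAL("RowNumbers")) s;
```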

The not so good news

First up, there’s nothing like CURRVAL (current value) in Oracle, or @@IDENTITY or SCOPE_IDENTITY() in SQL Server. The documentation argues that the omission is intentional, because a current-value function encourages row-by-row coding. Can’t say I buy that argument.

The biggest challenge (by far) is that you can’t reset the next value to be issued by either an IDENTITY or Sequence. This really needs to get fixed. Lots of people are complaining about it in the Snowflake Lodge (user forums) and it just isn’t a good thing. Here’s an example:

You define a table using an IDENTITY column or a Sequence, let’s say for a primary key.
You need to load existing data into it. This works easily as there is no concept of something like SET IDENTITY_INSERT ON/OFF. You can just insert.
The next row that gets added to the table will now probably be given a duplicate primary key value.
And as Snowflake also doesn’t check primary keys (or most other constraints), you’ll really just end up with multiple rows with the same primary key.
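The steps above can be sketched roughly like this. The "Products" table and its data are hypothetical, and the final query is just one way to look for the duplicates I’d expect to see.

```sql
-- Hypothetical table for illustration.
CREATE TABLE "Products"
(
    "ProductID" INTEGER IDENTITY(1, 1) PRIMARY KEY,
    "ProductName" STRING
);

-- Load existing rows with their existing key values.
-- No SET IDENTITY_INSERT equivalent is needed.
INSERT INTO "Products" ("ProductID", "ProductName")
VALUES (1, 'Popcorn'), (2, 'Caramel corn');

-- Add a new row and let the IDENTITY issue the key.
-- It has no knowledge of the values loaded above.
INSERT INTO "Products" ("ProductName")
VALUES ('Kettle corn');

-- Look for key values that appear more than once.
SELECT "ProductID", COUNT(*) AS "RowsWithKey"
FROM "Products"
GROUP BY "ProductID"
HAVING COUNT(*) > 1;
```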

ALTER SEQUENCE does allow you to change the increment, but won’t let you change the next value to be issued. (Or the seed).

You have to drop and recreate the sequence. And now here’s the even-nastier part: Snowflake doesn’t stop you dropping the existing sequence, but the column that was using it still references the dropped one. So even when you recreate a sequence with the same name, your next insert will blow up with an object not found error. (Again, people are complaining about this in the Snowflake Lodge).
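A sketch of that drop-and-recreate sequence of events, using hypothetical names. The comment on the last statement reflects the behavior described above, not something I’m claiming for every Snowflake version.

```sql
-- Hypothetical names for illustration.
CREATE SEQUENCE "ProductNumbers";

CREATE TABLE "Products"
(
    "ProductID" INTEGER DEFAULT "ProductNumbers".NEXTVAL,
    "ProductName" STRING
);

-- Dropping the sequence is allowed even though the column default uses it.
DROP SEQUENCE "ProductNumbers";

-- Recreate it with the same name and the starting value that's now needed.
CREATE SEQUENCE "ProductNumbers" START WITH = 3;

-- Per the behavior described above, this is where the
-- object not found error appears.
INSERT INTO "Products" ("ProductName")
VALUES ('Kettle corn');
```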

This means that you can’t add sequences as defaults to tables until you’ve finished loading all your data. That’s just not sensible, and breaks all the rules about separating DDL and data. The DDL that you write for a table should not be dependent upon what data happens to be loaded into a table.

How could you then write DDL that’s used in multiple places? (Unless you just always start with a gigantic seed value).
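For what it’s worth, the gigantic seed workaround would look something like this. The names are hypothetical, and the 1000000 starting value is just an assumption that all previously loaded keys are smaller than that.

```sql
-- Hypothetical names; the seed assumes existing keys are all below 1000000.
CREATE SEQUENCE "ProductNumbers"
    START WITH = 1000000
    INCREMENT BY = 1;

CREATE TABLE "Products"
(
    "ProductID" INTEGER DEFAULT "ProductNumbers".NEXTVAL,
    "ProductName" STRING
);

-- Existing rows keep their original (smaller) key values.
INSERT INTO "Products" ("ProductID", "ProductName")
VALUES (1, 'Popcorn'), (2, 'Caramel corn');

-- New rows take values from the seed upwards, clear of the loaded range.
INSERT INTO "Products" ("ProductName")
VALUES ('Kettle corn');
```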