SQL: Why I prefer sequences to IDENTITY columns

SQL: Why I prefer sequences to IDENTITY columns

I posted the other day that I don’t like negative ranges used for IDENTITY columns. I feel the same about the range of values for sequences, but when I mentioned that I prefer using sequences, a few friends asked why I preferred them.

Let me start by saying that I was surprised with the previous post, that the strong responses were about using natural vs surrogate keys. That had nothing to do with what I was discussing, and is an endless topic for another day. The issue was about when someone has already decided to use an IDENTITY or a sequence. And so, the same point applies to this discussion.

IDENTITY

IDENTITY columns have been around forever in SQL Server. They are convenient, but they have limitations.

They are documented here.

  • They are table-bound i.e. they are part of the definition of a table. There are positive aspects to them being part of a table’s definition. They only apply to a single table.
  • You can’t generate a value or a range of values, without inserting  rows into a table.
  • It’s possible, but hard to change the increment. (You can do it via partition switching but that’s very messy).
  • You have to use DBCC just to perform basic operations on them.
  • You need to use SET IDENTITY_INSERT ON (and later OFF) when inserting values for them, as often is needed when moving data between tables.
  • IDENTITY_INSERT doesn’t work across linked servers. This is a show-stopper for us in some situations.
  • The step is an increment i.e. it can’t be a decrement.
  • You have very limited control over caching of the values.
  • You can’t set MAX and MIN values, or whether the value CYCLES to the MIN when it goes past the MAX.
  • You can only have one per table. (That’s usually enough)
  • They can be used for tinyint, smallint, int, bigint, decimal(p,0), or numeric(p,0) data types.

Sequence

Sequences have been part of SQL Server since SQL Server 2012 and they are documented here.

  • They are schema-bound i.e. they are objects in a schema, just like tables are.
  • They can be used for multiple tables. For example I could have a single BookingID sequence and use it in a FlightBookings table, a CarBookings table, and a HotelBookings table. I then get a single sequence of values across all those tables.
  • They are easy to modify without the need for cryptic DBCC commands.
  • They work as expected over linked servers.
  • You can generate one without inserting a new row first (even outside a transaction)
  • You can generate a range of values by using sp_sequence_get_range
  • The step value can be a decrement if desired (Not sure I’ve ever used this though).
  • Could could use more than one per table if that makes sense for you.

Common

People often wonder about performance. Basically it’s the same if the caching is the same. Caching is used for both but you have more control over caching for sequences. For caching of IDENTITY values, SQL Server 2017 and later, and Azure SQL Database have an option to set IDENTITY_CACHE = ON | OFF.

Caching can lead to gaps on restarts, failovers, etc. You shouldn’t assume that either won’t have gaps. Disabling the caching of IDENTITY makes that less of an issue. If you want to do that at the server level, Trace Flag 272 can be set to get the same outcome.

Summary

Basically, apart from being directly part of the table definition, I now don’t find any redeeming features about IDENTITY columns. I use sequences all the time.

2020-04-17