Cannot determine next value for a sequence

One of the most anticipated new features in SQL Server 2012 was the introduction of sequences. Prior to SQL Server 2012, developers had a choice of IDENTITY columns or a roll-your-own table mechanism.

Sequences allow us to create a schema-bound object that is not associated with any particular table. For example, if I have a Sales.HotelBookings table, a Sales.FlightBookings table, and a Sales.VehicleBookings table, I might want to have a common BookingID used as the key for each table. If more than the BookingID was involved, you could argue that there is a normalization problem with the tables but we’ll leave that discussion for another day.

Recently when working with sequences however, I found a problem with their implementation. It works as described but is not useful.

So let’s start by creating the schema and the sequence:

image

 

We could then use this schema as the default value for each of the three tables:

image

All this is as expected. One question that often arises though, is "how do I know the last value for a given sequence". The answer provided is to query the sys.sequences view. We can do this as follows:

image

The current_value colum in sys.sequences is defined as follows:

Datatype: sql_variant NOT NULL

The use of sql_variant here makes sense as the view needs to be able to provide the current value for all sequences, regardless of data type. Sequences can be created with any built-in integer type. According to BOL, the possible values are:

  • tinyint – Range 0 to 255
  • smallint – Range -32,768 to 32,767
  • int – Range -2,147,483,648 to 2,147,483,647
  • bigint – Range -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
  • decimal and numeric with a scale of 0.
  • Any user-defined data type (alias type) that is based on one of the allowed types.

The output of that column is described as:

The last value obligated. That is, the value returned from the most recent execution of the NEXT VALUE FOR function or the last value from executing the sp_sequence_get_range procedure. Returns the START WITH value if the sequence has never been used.

And this is where I have a problem with how it’s defined. When you have never retrieved a value from the sequence, there is no last value obligated. What it does return is the first value that will be generated, but has not yet been generated:

clip_image001

The documentation is correct but the behaviour is bizarre. I believe that this column should return NULL. Otherwise, there is no way to tell that this value has not yet been generated.

If I generate a new value and then query it again ie:

image

Note that the same value is returned:

clip_image001[1]

It’s only when I request it another time, that I see the expected value:

clip_image002

So the problem is that when you read the current value from the sys.sequences view, there’s no way to know if this is the last value obligated or the next one that will be obligated.

I’d really like to see this behaviour changed. Given that the SQL Server team rates backwards compatibility highly, an alternative would be to add a new column to sys.sequences that indicates that the sequence has never been used. There is a column is_exhausted. At a pinch, that could be set for new sequences.

Thoughts?

If you agree, you can vote here: https://connect.microsoft.com/SQLServer/feedback/details/1461552

Plan Cache Pollution: Avoiding it and Fixing it

While SQL Server’s plan cache generally is self-maintaining, poor application coding practices can cause the plan cache to become full of query plans that have only ever been used a single time and that are unlikely to ever be reused. We call this “plan cache pollution”.

Causes

The most common cause of these issues are programming libraries that send multiple variations of a single query. For example, imagine I have a query like:

SELECT c.CustomerID, c.TradingName, c.ContactName, c.PhoneNumber FROM dbo.Customers AS c WHERE c.CustomerID = @CustomerID AND c.BusinessCategory = @BusinessCategory AND c.ContactName LIKE @ContactNameSearch ORDER BY c.CustomerID;

The query has three parameters: @CustomerID, @BusinessCategory, and @ContactNameSearch. If the parameters are always defined with the same data types ie: @BusinessCategory is always nvarchar(35) and so on, then we will normally end up with a single query plan. However, if on one execution the parameter is defined as nvarchar(35), and on the next execution it is defined as nvarchar(20), and on yet another execution it is defined as nvarchar(15), each of these queries will end up with different query plans. A similar problem would also occur if any of the plan-affecting SET options are different on each execution ie: if DATEFORMAT was dmy for one execution, and mdy for the next, you’ll also end up with a different plan.

For more details on the internal causes of this or for a list of plan-affecting SET options, you might want to read the whitepaper that I prepared for the MSDN site. The latest version was for SQL Server 2012 and can be found here: https://msdn.microsoft.com/en-us/library/dn148262.aspx (Plan Caching and Recompilation in SQL Server 2012).

So what on earth would cause someone to send parameters defined differently each time? The worst offenders are not queries that are written intentionally, they are queries written by frameworks.

As an example, while using the SqlCommand object in ADO.NET, it is convenient to use the AddWithValue(parametername, parametervalue) method of the Parameters collection. But notice that when you do this, you do not specify the data type of the parameter. ADO.NET has to derive an appropriate data type based on the data that you have provided. For string parameters, this can be particularly troubling. If the parameter value is initially “hello”, a query plan with an nvarchar parameter length of 5 will be cached after the command is executed. When the query is re-executed with a parameter value of “trouble”, the command will appear to be different as it has an nvarchar parameter with a length of 7.

The more the command is executed, the more the plan cache will become full of plans for different length string parameters. This is particularly troubling for commands with multiple string parameters as plans will end up being stored for all combinations of all lengths of all the parameters. Some later variants of these libraries are improved by always deriving strings as nvarchar(4000). That’s not ideal but it’s much better than the previous mechanism.

While someone coding with ADO.NET can use another method to add a parameter ie: one that allows specifying the data type as well, developers using higher level libraries do not have that option. For example, Lync to SQL uses AddWithValue() within the framework. The user has no control over that. Ad-hoc queries generated by end-user query tools can also cause a similar problem where many combinations of similar queries can end up becoming cached.

Avoiding Plan Cache Pollution

As mentioned, to work around such a problem, the application should use a method to add the parameter that allows specifying the data type precisely.

As an example, nvarchar(100) might be used as the data type for each execution in the above example, if we know that all possible parameter lengths are less than 100.

Treating Plan Cache Pollution

There are several additional options that can help in dealing with plan cache pollution issues:

FORCED PARAMETERIZATION

FORCED PARAMETERIZATION can be set at the database level. SQL Server will often auto-parameterize queries by determining that a value looks like a parameter, even though you didn’t specify it as a parameter. Using the FORCED PARAMETERIZATION setting makes SQL Server become much more aggressive in deciding which queries to auto-parameterize. The down-side of this option is that it could potentially introduce parameter-sensitivity problems. (This option was added in SQL Server 2005).

OPTIMIZE FOR ADHOC WORKLOADS

OPTIMIZE FOR ADHOC WORKLOADS is an sp_configure server level option. When set, SQL Server only caches a plan stub on the first execution of an ad-hoc query. The next time the same query is executed, the full plan is stored. Plan stubs are much smaller than query plans and this option ensures that the plan cache is not filled by query plans that have never been reused. (This option was added in SQL Server 2008). We tend to enable this option on most servers.

DBCC FREESYSTEMCACHE

Sometimes you can get into a situation where you simply cannot avoid the queries from creating this situation and you need to deal with it. DBCC FREESYSTEMCACHE can be used to clear the query cache. One little understood option on it however, is that you can then specify a particular Resource Governor resource pool. It then only clears the plans associated with that resource pool. (This command was first available in SQL Server 2005 but the option to clear a specific resource pool was added in SQL Server 2008).

We often use this method to work around plan cache pollution issues. We try to isolate the badly-behaved applications or ad-hoc queries into one or more separate resource pools using Resource Governor. Then periodically, (perhaps every 5 or 10 minutes), we clear the plan cache for members of this “tough luck” pool.

Best advice is to try to avoid the situation in the first place by appropriate coding techniques but that option isn’t available to everyone.