SQL: Using Optimize For Adhoc Workloads to Avoid Plan Cache Pollution

SQL: Using Optimize For Adhoc Workloads to Avoid Plan Cache Pollution

Memory is a precious resource on most systems today. SQL Server loves memory, so it’s important to avoid wasting it.

One common cause of wasted memory is what we call plan cache pollution. The plan cache is the region where SQL Server stores compiled query plans (and query plan stubs).

There are some nice new DMVs that give you details of what’s in that cache but nothing still compares to one of the older compatibility views sys.syscacheobjects. You can use it to see what’s in your plan cache:

You’ll see the different types of objects that are contained there but the one that’s of interest to us today is the Compiled Plan.

In the image above, I don’t know without looking further if the first three queries shown are actually the same query but they nicely demonstrate one of the issues. Because these plans are looked up via hashes of the queries, it’s really important that every time you call T-SQL code, that you use consistent calling. Note above that DB_ID() is capitalized in the first row, and now in the next two. Also notice that there is an extra space after db_id() in the third row. All it takes to get different query plans is for anything to be different. Whitespace, capitalization, etc. all matter, even on case-insensitive servers.

What causes the biggest issue with query plans though is how parameters are defined. One big issue with many frameworks is that they don’t define parameters correctly. This was a particular problem with Linq to SQL where under the covers it called the AddWithValue() method on the SqlCommand.Parameters collection. The problem with that method is it did not define the data type, and Linq to SQL had to derive the data type. What it did was to change the data type based on the length of strings. So “hello” was an nvarchar(5), “freddie” was an nvarchar(7) and so on.

What this meant is that your SQL Server plan cache could quickly become full of query plans for every combination of every length of parameter that was ever passed to it. In some later variations of this and other frameworks, they just now assign all strings to big buckets like nvarchar(4000) to try to avoid this problem.

It’s one of the reasons that I’m not a fan of ORMs. When you find a problem with something like this, what do you do?

Well what Microsoft did, after they actually caused this problem, was to add an interesting option to SQL Server.

EXEC sp_configure ‘optimize for adhoc workloads’, 1; GO

RECONFIGURE; GO

Once this configuration option is enabled, SQL Server doesn’t store a query plan the first time it encounters a particular query. It just stores a small stub so it can remember that it has seen the query before. Then the second time it sees a query, it stores the plan.

I believe this option really should be on for most systems. Yes, it slightly increases the amount of compilation that could occur but it avoids a truly polluted plan cache.

2017-11-20