Table Rebuild Avoids Excessive Lookups After Data Type Change

I’ve run into a situation at a number of sites where the following occurs

  • An excessive number of logical page reads during query execution
  • Changes have occurred from ntext, text, or image data types have been replaced by nvarchar(max), varchar(max), or varbinary(max) data types, as part of a clean-up of deprecated data types.
  • Rebuilding the table greatly reduces the number of page reads and the customer is puzzled about why.

One of the causes for this situation is related to how the data in these columns is stored. The ntext, text, and image data types defaulted to having their data stored out of row. The row contained a pointer to where the data was located. By comparison, the nvarchar(max), varchar(max), and varbinary(max) data types default to storing data in-row where possible.

The issue is that when the data type is changed, the data isn’t moved in-row and won’t move until the column is updated. Here’s an example:

Let’s start by creating the table:

clip_image002

Note that the RequestDetails column is of the ntext data type. It will default to being stored out of row. Let’s start by inserting some data into the table:

clip_image004

If we query the allocation units that have been created for this table, we see the following:

clip_image006

Note that there are both IN_ROW_DATA and LOB_DATA allocation units and both have been used.

For a comparison, let’s create another table dbo.RequestsX that uses nvarchar(max) instead and see the difference:

clip_image008

Note that although there are LOB_DATA and ROW_OVERFLOW_DATA allocation units, no pages have been allocated to either of those.

Now, let’s go back to the original table and change the data type of the RequestDetails column to nvarchar(max) and see the difference:

clip_image010

Note that the ROW_OVERFLOW_DATA allocation unit has appeared but isn’t used. Also the original two allocation units are unchanged in how much they are used. The bottom line is that the data hasn’t moved yet.

Let’s now update the column by setting it to its own value and compare the outcome:

clip_image012

Notice that the data has basically moved in-row now but we now have some fragmentation.

So just for completeness, let’s rebuild the table completely and compare the outcome:

clip_image014

So the important message is that if you go through your code and dutifully replace all the ntext, text, and image data types with nvarchar(max), varchar(max), and varbinary(max) data types, you’ll need to rebuild the table to get the best results.

2014-11-17