SQL: Text vs Varchar(max) - in row vs out of row storage

There was an interesting discussion on a local mailing list about the performance of a table that was just used for logging. One of the things that intrigued me was that the developer had used the text data type to store his log messages.
Now text, along with ntext and image, was deprecated back in 2005. Instead of text, ntext, and image, you should be using varchar(max), nvarchar(max), and varbinary(max).
The way the data is stored was also changed. The old data types used to store their data out of the table’s data rows, and just store pointers within the table’s data rows. This can be really slow. By comparison, the new types default to storing the data within the table’s data rows, and using overflow pages if need be. For many applications, this can be much faster. Simple writes to the end of a log file are a good example of that.
The problem with this is that you can’t just change the data type of a column from text to varchar(max). If you do that, the data won’t get moved in-row. You need to rebuild the table.
Let’s take a look at an example. I’ll start by creating two tables, one using text; the other using varchar(max):
Then we’ll populate both with 10,000 rows of data:
Finally, we’ll query to see where the data was allocated:
Note the one table is significantly smaller and only has in-row data.
Let’s try altering the column to change the data type:
And check again:
You can see that nothing has changed. Now let’s try rebuilding the table:
And still no prize. So we’ll do it the hard way:
And then we’re happy:
Note that even though it’s now defaulting to in-row data, if you do insert a value larger than 8000 characters, it’s going to use an overflow page for that data, and it will show as LOB_DATA. But the other rows won’t be using it.
2019-02-26