Rebuild clustered columnstore indexes when they require maintenance

For general index maintenance, our friend Old Hallengren has an awesome solution for most people: https://ola.hallengren.com/sql-server-index-and-statistics-maintenance.html

We’ve started to have customers using clustered columnstore indexes in SQL Server 2014, and they need to be treated differently. I checked Ola’s latest scripts today to see what happens with columnstore indexes. It appears that the code ignores nonclustered columnstore indexes (ie: index type of 6), which makes sense as we need to rebuild them whenever the data changes, and in the meantime, the table is read-only. So that makes lots of sense.

For other indexes (including clustered columnstore indexes which are updatable), the code uses the same avg_fragmentation_in_percent values from sys.dm_db_index_physical_stats. Unfortunately, that view doesn’t return any data for columnstore indexes of either type, so that’s not actually helpful.

Niko has an awesome series of blog posts about columnstore indexes that I encourage you to read: http://www.nikoport.com/columnstore/ In part 36, he discusses what he thinks is needed for maintenance. The metrics that I want to use overlap the ones that he’s suggested in that article.

With a clustered columnstore index, when a row is deleted, a flag in a bitmap is updated to indicate the deletion. The row is still present in the columnstore row group. When an update occurs, the same mechanism flags the row as deleted, then inserts the new version into the delta store.

The main problems that will occur are when many rows have been deleted.

In the code below, I’ve provided an implementation that will rebuild clustered columnstore indexes in a database when one of three criteria is met:

  • The overall percentage of deleted rows has exceeded a cutoff (I’ve suggested a 10% default)
  • The percentage of rows in any individual segment exceeds a cutoff (I’ve suggested a 20% default)
  • The number of segments that are completely empty (all rows have been deleted) has exceeded a cutoff (I’ve suggested none)

This would work for both COLUMNSTORE and COLUMNSTORE_ARCHIVE compression.

Hope it helps someone until Ola does his magic:

-———————————————————————

-- Rebuild clustered columnstore indexes when necessary

-- Dr Greg Low v1.0

-———————————————————————

-- rebuild when more than supplied % of rows have been deleted

-- rebuild when any segments contain more than supplied % deleted rows

-- rebuild if more than supplied number of segments are empty

DECLARE @DeletedTotalRowPercentage int = 10;   

DECLARE @DeletedSegmentsRowPercentage int = 20;

DECLARE @EmptySegmentsAllowed int = 0;         

-———————————————————————

DECLARE @IndexesToRebuild TABLE (SchemaName sysname,

                                 TableName sysname,

                                 IndexName sysname);

WITH ClusteredColumnstoreIndexes

AS

( SELECT t.object_id AS ObjectID,

         SCHEMA_NAME(t.schema_id) AS SchemaName,

         t.name AS TableName,

         i.name AS IndexName

  FROM sys.indexes AS i

  INNER JOIN sys.tables AS t

  ON i.object_id = t.object_id

  WHERE i.type = 5

),

RowGroups

AS

( SELECT csrg.object_id AS ObjectID,

         csrg.total_rows AS TotalRows,

         csrg.deleted_rows AS DeletedRows,

         csrg.deleted_rows * 100.0 / csrg.total_rows AS DeletedPercentage,

         CASE WHEN csrg.total_rows = csrg.deleted_rows

              THEN 1 ELSE 0

         END AS IsEmptySegment

  FROM sys.column_store_row_groups AS csrg

),

IndexStats

AS

( SELECT cci.ObjectID,

         cci.SchemaName,

         cci.TableName,

         cci.IndexName,

         SUM(rg.TotalRows) AS TotalRows,

         SUM(rg.DeletedRows) AS DeletedRows,

         SUM(rg.DeletedRows) * 100.0 / SUM(rg.TotalRows) AS DeletedPercentage,

         SUM(rg.IsEmptySegment) aS EmptySegments

  FROM ClusteredColumnstoreIndexes AS cci

  INNER JOIN RowGroups AS rg

  ON cci.ObjectID = rg.ObjectID

  GROUP BY cci.ObjectID, cci.SchemaName, cci.TableName, cci.IndexName

)

INSERT @IndexesToRebuild (SchemaName, TableName, IndexName)

SELECT s.SchemaName, s.TableName, s.IndexName

FROM IndexStats AS s

WHERE s.DeletedPercentage > @DeletedTotalRowPercentage

OR s.EmptySegments > @EmptySegmentsAllowed

OR EXISTS(SELECT 1 FROM RowGroups AS rg

                   WHERE rg.ObjectID = s.ObjectID

                   AND rg.DeletedPercentage > @DeletedSegmentsRowPercentage);

DECLARE @SchemaName sysname;

DECLARE @TableName sysname;

DECLARE @IndexName sysname;

DECLARE @SqlCommand nvarchar(max);

DECLARE IndexList CURSOR FAST_FORWARD READ_ONLY

FOR

  SELECT SchemaName, TableName, IndexName

  FROM @IndexesToRebuild

  ORDER BY SchemaName, TableName, IndexName;

OPEN IndexList;

FETCH NEXT FROM IndexList INTO @SchemaName, @TableName, @IndexName;

WHILE @@FETCH_STATUS = 0

BEGIN

  SET @SqlCommand = N’ALTER INDEX ’ + QUOTENAME(@IndexName)

                  + N’ ON ’ + QUOTENAME(@SchemaName)

                  + N’.’ + QUOTENAME(@TableName)

                  + N’ REBUILD PARTITION = ALL;';

  PRINT @SqlCommand;

  EXEC(@SqlCommand); 

  FETCH NEXT FROM IndexList INTO @SchemaName, @TableName, @IndexName;

END;

CLOSE IndexList;

DEALLOCATE IndexList;

2015-01-09