SQL: Rebuild clustered columnstore indexes when they require maintenance
For general index maintenance, our friend Old Hallengren has an awesome solution for most people: https://ola.hallengren.com/sql-server-index-and-statistics-maintenance.html
We started to have customers using clustered columnstore indexes from SQL Server 2014, and they needed to be treated differently. At the time, I checked Ola’s latest scripts to see what happens with columnstore indexes.
It appeared that the code ignored nonclustered columnstore indexes (ie: index type of 6), which made sense as we would potentially need to rebuild them whenever the data changes, and at the time, the table was read-only. So that made lots of sense. Fortunately, that limitation is now gone.
For other indexes (including clustered columnstore indexes which were updatable), the code used the same avg_fragmentation_in_percent values from sys.dm_db_index_physical_stats. Unfortunately, that view didn’t return any data for columnstore indexes of either type, so it wasn’t actually helpful.
Niko has an awesome series of blog posts about columnstore indexes that I encourage you to read: http://www.nikoport.com/columnstore/ In part 36, he discusses what he thinks is needed for maintenance. The metrics that I wanted to use overlapped the ones that he suggested in that article.
With a clustered columnstore index, when a row is deleted, a flag in a bitmap is updated to indicate the deletion. The row is still present in the columnstore row group. When an update occurs, the same mechanism flags the row as deleted, then inserts the new version into the delta store.
The main problems that occur are when many rows have been deleted.
In the code below, I’ve provided an implementation that will rebuild clustered columnstore indexes in a database when one of three criteria is met:
- The overall percentage of deleted rows has exceeded a cutoff (I’ve suggested a 10% default)
- The percentage of rows in any individual segment exceeds a cutoff (I’ve suggested a 20% default)
- The number of segments that are completely empty (all rows have been deleted) has exceeded a cutoff (I’ve suggested none)
This would work for both COLUMNSTORE and COLUMNSTORE_ARCHIVE compression.
Rebuild or reorganize
The first version of this script that I published in 2014 could be used with clustered columnstore indexes to try to determine when they should be rebuilt. Reorganize is not a very useful operation for these indexes as it basically just forces a close of the existing delta store, ready for compression.
One of the problems with a clustered columnstore index is that it doesn’t work as well if the segments have been created at less than the maximum potential size of 1048576 rows per segment. Over use of reorganize options could cause this, but more likely it will be to do with loading the index with small batches of data, rather than loading large batches in BULK INSERT operations.
The first update I did to the maintenance script below was so that you could also choose to rebuild if your segments are much less full than their optimal size ie: you could have 1,048,560,000 rows of data in 10,000 segments. So if you say 90 percent as the cutoff and your segments hold less than 90 percent of their potential, that will also trigger a rebuild.
Niko’s update
After I published the scripts, Niko Neugebauer sent me a suggested update to the latest version of those scripts and it makes sense. The change will help to filter out row groups that are not compressed and Niko noted that as Microsoft have started to include more objects (like Tombstone values) into sys.column_store_row_groups in Azure SQLDatabase, he believes that will help to prevent future SQL versions having a problem with the scripts.
Partition handling
One thing that wasn’t covered by the previous versions of the script was the ability to only individual index partitions. I then provided an updated version of the script that will rebuild only affected partitions if the clustered columnstore index is partitioned.
Updated script version
Version 2.4 of the scripts is below. And it brings together the previous ideas into one place. Hope it helps someone.
----------------------------------------------------------------------
-- Rebuild clustered columnstore indexes when necessary
--
-- Dr Greg Low v2.4
----------------------------------------------------------------------
-- rebuild when more than supplied % of rows have been deleted
-- rebuild when any segments contain more than supplied % deleted rows
-- rebuild if more than supplied number of segments are empty
-- rebuild when segments are on average less than the supplied % of
-- maximum capacity
DECLARE @DeletedTotalRowPercentage int = 10;
DECLARE @DeletedSegmentsRowPercentage int = 20;
DECLARE @EmptySegmentsAllowed int = 0;
DECLARE @SegmentPotentialFullnessPercentage int = 90;
----------------------------------------------------------------------
DECLARE @IndexesToRebuild TABLE
(
SchemaName sysname,
TableName sysname,
IndexName sysname,
PartitionNumber int,
IsPartitioned bit
);
SET NOCOUNT ON;
WITH ClusteredColumnstoreIndexes
AS
(
SELECT t.object_id AS ObjectID,
SCHEMA_NAME(t.schema_id) AS SchemaName,
t.name AS TableName,
i.name AS IndexName
FROM sys.indexes AS i
INNER JOIN sys.tables AS t
ON t.object_id = i.object_id
WHERE i.type = 5
),
RowGroups
AS
(
SELECT csrg.object_id AS ObjectID,
csrg.partition_number AS PartitionNumber,
csrg.total_rows AS TotalRows,
csrg.deleted_rows AS DeletedRows,
csrg.deleted_rows * 100.0 / csrg.total_rows AS DeletedPercentage,
CASE WHEN csrg.total_rows = csrg.deleted_rows
THEN 1 ELSE 0
END AS IsEmptySegment
FROM sys.column_store_row_groups AS csrg
WHERE csrg.state = 3 -- Compressed
-- (Ignoring: 0 - Hidden, 1 - Open, 2 - Closed, 4 - Tombstone)
),
IndexPartitionStats
AS
(
SELECT cci.ObjectID,
cci.SchemaName,
cci.TableName,
cci.IndexName,
rg.PartitionNumber,
SUM(CAST(rg.TotalRows AS decimal(18,0))) AS TotalRows,
SUM(CAST(rg.DeletedRows AS decimal(18,0))) AS DeletedRows,
SUM(CAST(rg.DeletedRows AS decimal(18,0))) * 100.0
/ SUM(CAST(rg.TotalRows AS decimal(18,0)))
AS DeletedPercentage,
SUM(rg.IsEmptySegment) aS EmptySegments,
COUNT(rg.TotalRows) AS TotalSegments
FROM ClusteredColumnstoreIndexes AS cci
INNER JOIN RowGroups AS rg
ON rg.ObjectID = cci.ObjectID
GROUP BY cci.ObjectID, cci.SchemaName, cci.TableName, cci.IndexName, rg.PartitionNumber
)
INSERT @IndexesToRebuild (SchemaName, TableName, IndexName, PartitionNumber, IsPartitioned)
SELECT s.SchemaName,
s.TableName,
s.IndexName,
s.PartitionNumber,
CASE WHEN EXISTS(SELECT 1
FROM IndexPartitionStats AS ps
WHERE s.SchemaName = ps.SchemaName
AND s.TableName = ps.TableName
AND s.IndexName = ps.IndexName
AND ps.PartitionNumber > 1)
THEN 1
ELSE 0
END
FROM IndexPartitionStats AS s
WHERE s.DeletedPercentage > @DeletedTotalRowPercentage
OR s.EmptySegments > @EmptySegmentsAllowed
OR EXISTS(SELECT 1 FROM RowGroups AS rg
WHERE rg.ObjectID = s.ObjectID
AND rg.DeletedPercentage > @DeletedSegmentsRowPercentage)
OR (s.TotalRows * 100 / (s.TotalSegments * 1048576.0))
< @SegmentPotentialFullnessPercentage;
DECLARE @SchemaName sysname;
DECLARE @TableName sysname;
DECLARE @IndexName sysname;
DECLARE @PartitionNumber int;
DECLARE @IsPartitioned bit;
DECLARE @SqlCommand nvarchar(max);
DECLARE IndexList CURSOR FAST_FORWARD READ_ONLY
FOR
SELECT SchemaName, TableName, IndexName, PartitionNumber, IsPartitioned
FROM @IndexesToRebuild
ORDER BY SchemaName, TableName, IndexName, PartitionNumber;
OPEN IndexList;
FETCH NEXT FROM IndexList INTO @SchemaName, @TableName, @IndexName, @PartitionNumber, @IsPartitioned;
WHILE @@FETCH_STATUS = 0
BEGIN
SET @SqlCommand = N'ALTER INDEX ' + QUOTENAME(@IndexName)
+ N' ON ' + QUOTENAME(@SchemaName)
+ N'.' + QUOTENAME(@TableName)
+ N' REBUILD'
+ CASE WHEN @IsPartitioned <> 0
THEN N' PARTITION = ' + CAST(@PartitionNumber AS nvarchar(10))
ELSE N''
END
+ ';';
PRINT @SqlCommand;
EXEC(@SqlCommand);
FETCH NEXT FROM IndexList INTO @SchemaName, @TableName, @IndexName, @PartitionNumber, @IsPartitioned;
END;
CLOSE IndexList;
DEALLOCATE IndexList;
2026-06-21