SQL: Adding time to Change Data Capture (CDC) Events

Several times now on forums, I’ve seen questions about Change Data Capture (CDC). People like the way they can retrieve details about changes that have occurred (often to trickle-feed into a data warehouse), but they are puzzled why CDC doesn’t tell them when the event occurred. There’s an easy fix for that.

Let’s start by doing a quick CDC setup:

DROP DATABASE IF EXISTS CDC;
GO

CREATE DATABASE CDC;
GO

USE CDC;
GO

CREATE TABLE dbo.NewEmployees
( 
    NewEmployeeID int IDENTITY(1,1) PRIMARY KEY CLUSTERED,
    FullName nvarchar(100)
);
GO

-- note new last few columns in:

SELECT * FROM sys.databases WHERE name = N'CDC';

-- enable cdc at the database level

EXEC sys.sp_cdc_enable_db;

-- enable cdc for the table - requires SQL Agent running

EXEC sys.sp_cdc_enable_table
  @source_schema = 'dbo',
  @source_name = 'NewEmployees',
  @supports_net_changes = 1,
  @role_name = NULL;

-- review the state of CDC

SELECT name, is_tracked_by_cdc FROM sys.tables;
GO

And let’s then add three rows of data, then change one of them:

INSERT dbo.NewEmployees 
(
    FullName
)
VALUES('Fred Jones'),('Steve Ballmer'),('Bill Gates');

UPDATE dbo.NewEmployees SET FullName = 'Greg Low' 
    WHERE NewEmployeeID = 1;
GO

There are two ways we can see the changes:

DECLARE @From_LSN binary(10) = sys.fn_cdc_get_min_lsn('dbo_NewEmployees');
DECLARE @To_LSN binary(10) = sys.fn_cdc_get_max_lsn();

SELECT c.* 
FROM cdc.fn_cdc_get_all_changes_dbo_NewEmployees
         (@From_LSN, @To_LSN, 'all') AS c;
SELECT c.* 
FROM cdc.fn_cdc_get_net_changes_dbo_NewEmployees
         (@From_LSN, @To_LSN, 'all') AS c;
GO

That returns the following:

The first SELECT returns all changes as they occur, and the second SELECT returns the net affect of those changes. Because NewEmployeeID #1 was inserted as Fred Jones, and later changed to Greg Low, the net affect is just of an insert as Greg Low.

But notice, there’s no time value. What we can use to get the time is the LSN (log sequence number) by calling the sys.fn_cdc_map_lsn_to_time() function like this:

DECLARE @From_LSN binary(10) = sys.fn_cdc_get_min_lsn('dbo_NewEmployees');
DECLARE @To_LSN binary(10) = sys.fn_cdc_get_max_lsn();

SELECT c.*, sys.fn_cdc_map_lsn_to_time(c.__$start_lsn) AS ChangeDateTime 
FROM cdc.fn_cdc_get_all_changes_dbo_NewEmployees
         (@From_LSN, @To_LSN, 'all') AS c;

SELECT c.*, sys.fn_cdc_map_lsn_to_time(c.__$start_lsn) AS ChangeDateTime 
FROM cdc.fn_cdc_get_net_changes_dbo_NewEmployees
         (@From_LSN, @To_LSN, 'all') AS c;
GO

Now that returns:

You can see that the function returns us the time. The time is in local time for the server.

If you want to tear the example back down, just do this:

EXEC sys.sp_cdc_disable_db;
GO

DROP TABLE IF EXISTS dbo.NewEmployee;
GO

USE master;
GO

ALTER DATABASE CDC SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
GO

DROP DATABASE IF EXISTS CDC;
GO

2020-05-14