SQL Interview: 52 Deleting duplicated rows

This is a post in the SQL Interview series. These aren’t trick or gotcha questions, they’re just questions designed to scope out a candidate’s knowledge around SQL Server and Azure SQL Database.

Section: Administrator Level: Medium

Question:

You have an ETL process that loads employee details into a table named Staging.Employees. The table has the following columns:

EmployeeID (int)
FullName (nvarchar(100))
LoginName (varchar(100))

The table is a heap and has no primary key.

The ETL process accidentally loaded the employee details three times. You need to write code to remove the duplicated rows. What code should you use?

Answer:

While you could do this by selecting the distinct rows into another table, then truncating this table, and inserting the distinct rows back in, you can do it with a CTE quite easily:

WITH OrderedEmployees AS
(
    SELECT *, 
           ROW_NUMBER() 
           OVER 
           (
               PARTITION BY EmployeeID, FullName, LoginName
               ORDER BY (SELECT NULL)
           ) AS CopyNumber
    FROM Staging.Employees 
)

DELETE
FROM OrderedEmployees
WHERE CopyNumber > 1;

2025-06-01