T-SQL ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...) to Delete Duplicate Rows in SQL Table

If you have duplicate rows in your MS SQL Server database tables, you may want to delete duplicate records.
T-SQL Row_Number() function can help sql developers to solve this sql problem.
Before removing some of the rows in the SQL database table, you should decide which one of the row duplicates you will keep. And you should decide which rows to delete.

You might want to keep the first entry, which can be assumed as the inserted date is earliest.
Or if your sql table has an identity column, let's say an Id column with data type integer (int).
You may want to keep the duplicate row with the smallest Id value, or with the biggest Id value.
So SQL Row_Number enables tsql developers rank records and table rows over desired columns and table fields even enables partitioning the result set based on table columns.

Before giving an sql example for deleting duplicate rows, let's create sql table and populate table with sample duplicate rows.


CREATE TABLE Suppliers

(

  Id int identity (1,1),

  CompanyTitle nvarchar(1000),

  ContactName nvarchar(100),

  LastContactDate datetime

)

Code

Now insert sample data into Suppliers table for removing duplicate records example.


INSERT Suppliers VALUES (N'Melody Music Instruments',N'James Manning', '20090623 10:15')

INSERT Suppliers VALUES (N'Blue Jazz',N'Mike Clark', '20090720 15:40')

INSERT Suppliers VALUES (N'Top Music',N'Katy Swan', '20090827 18:00')

INSERT Suppliers VALUES (N'Blue Jazz',N'Mike Clark', '20090806 10:00')

INSERT Suppliers VALUES (N'Melody Music Instruments',N'James Brown', '20080121 11:20')

INSERT Suppliers VALUES (N'Top Music',N'Katy Perry', '20090825 14:00')

INSERT Suppliers VALUES (N'Top Music',N'Katy Perry', '20090825 14:00')

Code

Order Rows by Datetime Fields

Let's order table rows according to Last Contact Date within each Company record.


SELECT

  RN = ROW_NUMBER() OVER (PARTITION BY CompanyTitle ORDER BY LastContactDate DESC),

  Id, CompanyTitle, ContactName, LastContactDate 

FROM Suppliers

Code

Now we can develop t-sql code to delete rows where RN is different than 1, which we assume duplicates since they are old records. We have a newer row in the sql table Suppliers.
This sql script uses the T-SQL Row_Number() function with Partition By option.
T-SQL developers will soon realize that they can use Partition Over clause in many cases to solve their problems.


WITH [CTE DUPLICATE] AS 

(

SELECT 

  RN = ROW_NUMBER() OVER (PARTITION BY CompanyTitle ORDER BY LastContactDate DESC),

  Id, CompanyTitle, ContactName, LastContactDate 

FROM Suppliers

)

DELETE FROM [CTE DUPLICATE] WHERE RN > 1

Code

Order Rows by Identity Field

For this example code, we will order sql table rows according to the Identity field column Id within each Company record.


SELECT

  RN = ROW_NUMBER() OVER (PARTITION BY CompanyTitle ORDER BY Id DESC),

  Id, CompanyTitle, ContactName, LastContactDate 

FROM Suppliers

Code

And within each group that is formed by the Company field, we will only keep the row with biggest Identity field value (Id column value), and remove others by deleting them.


WITH [CTE DUPLICATE] AS 

(

SELECT 

  RN = ROW_NUMBER() OVER (PARTITION BY CompanyTitle ORDER BY Id DESC),

  Id, CompanyTitle, ContactName, LastContactDate 

FROM Suppliers

)

DELETE FROM [CTE DUPLICATE] WHERE RN > 1

Code

Summary - Delete Duplicate Rows in SQL Table

It is very easy to identify duplicate rows and delete duplicates due to your requirements by using a partition over clause, ROW_NUMBER() OVER (PARTITION BY columnname1, columnname2 ORDER BY columnname3 DESC) and using it within a CTE t-sql statement as shown in the above t-sql examples.
I hope you enjoy the sql examples and find them useful.

Additional Tutorials for Dublicate Rows

If you have identical rows, I mean all column values are identical, how can we identify the duplicate rows and delete duplicate records from database table using t-sql commands. In this case there is still methods for sql administrators and t-sql developers to delete duplicate rows in sql tables. You can find an other sql tutorial titled How to delete duplicate records or rows among identical rows in a table where no primary key exists among Kodyaz SQL articles.

SQL Server

SQL Server 2019 Installation
download SQL Server 2019
download SQL Server 2017
download SQL Server 2016
download SQL Server 2014
download SQL Server 2012
MacOS ve SQL Server 2019

SQL Resources

SQL Server Tutorials
SQL Server 2012
SQL Server Tools
SQL Server SSIS