T-SQL ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...) to Delete Duplicate Rows in SQL Table
If you have duplicate rows in your MS SQL Server database tables, you may want to delete duplicate records.
T-SQL Row_Number() function can help sql developers to solve this sql problem.
Before removing some of the rows in the SQL database table, you should decide which one of the row duplicates you will keep.
And you should decide which rows to delete.
You might want to keep the first entry, which can be assumed as the inserted date is earliest.
Or if your sql table has an identity column, let's say an Id column with data type integer (int).
You may want to keep the duplicate row with the smallest Id value, or with the biggest Id value.
So SQL Row_Number enables tsql developers rank records and table rows over desired columns and table fields even enables partitioning the result set based on table columns.
Before giving an sql example for deleting duplicate rows, let's create sql table and populate table with sample duplicate rows.
Now insert sample data into Suppliers table for removing duplicate records example.
Order Rows by Datetime Fields
Let's order table rows according to Last Contact Date within each Company record.
Now we can develop t-sql code to delete rows where RN is different than 1, which we assume duplicates since they are old records. We have a newer row in the sql table Suppliers.
This sql script uses the T-SQL Row_Number() function with Partition By option.
T-SQL developers will soon realize that they can use Partition Over clause in many cases to solve their problems.
Order Rows by Identity Field
For this example code, we will order sql table rows according to the Identity field column Id within each Company record.
And within each group that is formed by the Company field, we will only keep the row with biggest Identity field value (Id column value), and remove others by deleting them.
Summary - Delete Duplicate Rows in SQL Table
It is very easy to identify duplicate rows and delete duplicates due to your requirements by using a partition over clause, ROW_NUMBER() OVER (PARTITION BY columnname1, columnname2 ORDER BY columnname3 DESC) and using it within a CTE t-sql statement as shown in the above t-sql examples.
I hope you enjoy the sql examples and find them useful.
Additional Tutorials for Dublicate Rows
If you have identical rows, I mean all column values are identical, how can we identify the duplicate rows and delete duplicate records from database table using t-sql commands.
In this case there is still methods for sql administrators and t-sql developers to delete duplicate rows in sql tables.
You can find an other sql tutorial titled How to delete duplicate records or rows among identical rows in a table where no primary key exists among Kodyaz SQL articles.