Home > front end >  SQL Server - How to remove duplicates within select query?
SQL Server - How to remove duplicates within select query?

Time:12-22

How do you delete duplicate records based on multiple paramaters from a select query?

I have a complex query that returns these, and I need to remove duplicates based on IdRow, IdAudience and IdAb, so in the case below I would need to remove Ids 1 and 3:

Id IdRow IdAudience IdAb Quantity
1 NULL NULL 2836 1
2 NULL NULL 2836 1
3 560540 NULL NULL 77
4 560540 NULL NULL 77

But how do I do this from a select query?

CodePudding user response:

You could use a simple aggregation approach:

SELECT MAX(Id) AS Id, IdRow, IdAudience, IdAb, Quantity
FROM yourTable
GROUP BY IdRow, IdAudience, IdAb, Quantity
ORDER BY 1;

This assumes that, given a set of 2 or more duplicates, you want to retain the single record with the max Id value.

CodePudding user response:

You could certainly use the ever popular ROW_NUMBER() function to filter out the duplicates as well:

;WITH CTE AS
(
    SELECT * , ROW_NUMBER() OVER(PARTITION BY IdRow, IdAudience,IdAb ORDER BY Id DESC) RN
    FROM yourTable
)

DELETE FROM CTE 
WHERE RN > 1

CodePudding user response:

You can write this in a subquery as below

delete from yourtable where Id not in 
(
SELECT top 1000 MAX(Id) AS Id
FROM yourtable
GROUP BY IdRow, IdAudience, IdAb, Quantity
ORDER BY Id
)
  • Related