How do you delete duplicate records based on multiple paramaters from a select query?
I have a complex query that returns these, and I need to remove duplicates based on IdRow, IdAudience and IdAb, so in the case below I would need to remove Ids 1 and 3:
Id | IdRow | IdAudience | IdAb | Quantity |
---|---|---|---|---|
1 | NULL | NULL | 2836 | 1 |
2 | NULL | NULL | 2836 | 1 |
3 | 560540 | NULL | NULL | 77 |
4 | 560540 | NULL | NULL | 77 |
But how do I do this from a select query?
CodePudding user response:
You could use a simple aggregation approach:
SELECT MAX(Id) AS Id, IdRow, IdAudience, IdAb, Quantity
FROM yourTable
GROUP BY IdRow, IdAudience, IdAb, Quantity
ORDER BY 1;
This assumes that, given a set of 2 or more duplicates, you want to retain the single record with the max Id
value.
CodePudding user response:
You could certainly use the ever popular ROW_NUMBER() function to filter out the duplicates as well:
;WITH CTE AS
(
SELECT * , ROW_NUMBER() OVER(PARTITION BY IdRow, IdAudience,IdAb ORDER BY Id DESC) RN
FROM yourTable
)
DELETE FROM CTE
WHERE RN > 1
CodePudding user response:
You can write this in a subquery as below
delete from yourtable where Id not in
(
SELECT top 1000 MAX(Id) AS Id
FROM yourtable
GROUP BY IdRow, IdAudience, IdAb, Quantity
ORDER BY Id
)