There are a ton of questions and answeres out there regarding this question, but none is working for my special problem:
I have a table parts
with columns Number
and Version
.
Number Version
A123 0
A123 1
A123 2
B123 0
C213 0
C123 1
... ...
I want to remove all duplicate Numbers but keep the ones with the max Version.
I created this code:
DELETE FROM `parts`
WHERE Version NOT IN
(SELECT * FROM
(
SELECT MAX(Version)
FROM `parts`
GROUP BY Number
) AS duplicates
);
But this has a unexpected behaviour and is not removing all duplicates.
Here is a fiddle: http://sqlfiddle.com/#!9/4cc59e0/1
CodePudding user response:
You can try to use DELETE
with JOIN
which using a subquery to keep MAX
version each Number
DELETE p1
FROM parts p1
INNER JOIN
(
SELECT MAX(pp.Version) m_version,pp.Number
FROM parts pp
GROUP BY pp.Number
) p2 ON p2.Number = p1.Number AND m_version > p1.Version
CodePudding user response:
I would use a delete anti-join here:
DELETE p1
FROM parts p1
LEFT JOIN
(
SELECT Number, MAX(Version) AS MaxVersion
FROM parts
GROUP BY Number
) p2
ON p2.Number = p1.Number
WHERE
p2.Number IS NULL;
The WHERE
condition in the above delete join will be true for those part numbers which did not join to the max version for that part.
CodePudding user response:
Try exclude possible Version NULL values:
DELETE FROM `parts`
WHERE Version NOT IN
(SELECT * FROM
(
SELECT MAX(Version)
FROM `parts`
WHERE Version IS NOT NULL
GROUP BY Number
) AS duplicates
);
The first SELECT return empty when there is NULL values from the inner SELECT.