Home > Net >  How to delete all dulicate table excluding max value?
How to delete all dulicate table excluding max value?

Time:03-09

There are a ton of questions and answeres out there regarding this question, but none is working for my special problem:

I have a table parts with columns Number and Version.

Number Version
A123   0
A123   1
A123   2
B123   0
C213   0
C123   1
...    ...

I want to remove all duplicate Numbers but keep the ones with the max Version.

I created this code:

DELETE FROM `parts`
WHERE Version NOT IN
    (SELECT * FROM 
        (
            SELECT MAX(Version)
            FROM `parts`
            GROUP BY Number
        ) AS duplicates
);

But this has a unexpected behaviour and is not removing all duplicates.

Here is a fiddle: http://sqlfiddle.com/#!9/4cc59e0/1

CodePudding user response:

You can try to use DELETE with JOIN which using a subquery to keep MAX version each Number

DELETE p1
FROM parts p1
INNER JOIN
(
  SELECT MAX(pp.Version) m_version,pp.Number
  FROM parts pp
  GROUP BY pp.Number
) p2 ON p2.Number = p1.Number AND m_version > p1.Version

sqlfiddle

CodePudding user response:

I would use a delete anti-join here:

DELETE p1
FROM parts p1
LEFT JOIN
(
    SELECT Number, MAX(Version) AS MaxVersion
    FROM parts
    GROUP BY Number
) p2
    ON p2.Number = p1.Number
WHERE
    p2.Number IS NULL;

The WHERE condition in the above delete join will be true for those part numbers which did not join to the max version for that part.

CodePudding user response:

Try exclude possible Version NULL values:

DELETE FROM `parts`
WHERE Version NOT IN
   (SELECT * FROM 
        (
            SELECT MAX(Version)
            FROM `parts`
            WHERE Version IS NOT NULL
            GROUP BY Number
        ) AS duplicates
    );

The first SELECT return empty when there is NULL values from the inner SELECT.

  • Related