My problem is that I want to delete duplicated records but keeps XX latest records. For example:
id | ean | price | price_type | country | valid_to |
---|---|---|---|---|---|
1 | 12345678 | 19.99 | b2c | US | 2022-03-30 |
2 | 12345678 | 18.99 | b2c | US | 2022-03-28 |
3 | 12345678 | 17.99 | b2c | US | 2022-03-26 |
4 | 11122233 | 146.99 | b2b | US | 2022-03-30 |
5 | 11122233 | 150.99 | b2b | US | 2022-03-28 |
6 | 11122233 | 170.99 | b2b | US | 2022-03-26 |
7 | 11122233 | 180.99 | b2b | US | 2022-04-01 |
Desired results - keeps 2 latest records:
id | ean | price | price_type | country | valid_to |
---|---|---|---|---|---|
1 | 12345678 | 19.99 | b2c | US | 2022-03-30 |
2 | 12345678 | 18.99 | b2c | US | 2022-03-28 |
4 | 11122233 | 146.99 | b2b | US | 2022-03-30 |
7 | 11122233 | 180.99 | b2b | US | 2022-04-01 |
Which query statement should I make to do this? Thank you!
CodePudding user response:
What is precise MySQL version? – Akina
MySQL 5.7 – nphuly
DELETE t0
FROM test t0
JOIN ( SELECT t1.id, COUNT(*) cnt
FROM test t1
JOIN test t2 ON t1.ean = t2.ean AND t1.valid_to <= t2.valid_to
GROUP BY t1.id ) t3 ON t0.id = t3.id
WHERE t3.cnt > 2
https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=3284d0a012272813c1fbd6121bfd51b3
CodePudding user response:
it is support MSSQL 8.0 version or above
use ROW_NUMBER
with PARTITION and get your result
Get only 2 record of each ean
SELECT * FROM (
SELECT id,ean,price,price_type,country,valid_to,
ROW_NUMBER() OVER(PARTITION BY valid_to ORDER BY ean,valid_to) AS RNO
FROM tablename
)main
WHERE RNO < 3
if you want to delete then use this
DELETE FROM tablename WHERE id IN (
SELECT Id FROM (
SELECT id,
ROW_NUMBER() OVER(PARTITION BY valid_to ORDER BY ean,valid_to) AS RNO
FROM tablename
)main
WHERE RNO > 2
)