How would I delete yesterday's duplicate data from a MySQL Table and keep latest record.
For example, with the following data:
SELECT * FROM data;
| pid | serviceid | id | created
| ----| ----------|----------------|---------
| 10 | ABNCKN | Q0CHZ2GI1VKOPL | 2022-06-14 10:49:34
| 11 | ABNCKN | Q0CHZ2GI1VKOPL | 2022-06-14 10:49:34
| 12 | KSPSLS | QLSPDF0S0SDFKK | 2022-06-15 11:44:21
| 13 | AKNKSL | QLSPDF0S0SDFKK | 2022-06-15 12:51:42
| 14 | AKNKSL | QLSPDF0S0SDFKK | 2022-06-15 12:51:42
I used
DELETE n1 FROM data n1, data n2 WHERE n1.pid < n2.pid AND n1.id = n2.id
How could I DELETE to only remove duplicates and keep latest pid for yesterday's data.
Expected output would be
SELECT * FROM data;
| pid | serviceid | id | created
| ----| ----------|----------------|---------
| 10 | ABNCKN | Q0CHZ2GI1VKOPL | 2022-06-14 10:49:34
| 11 | ABNCKN | Q0CHZ2GI1VKOPL | 2022-06-14 10:49:34
| 12 | KSPSLS | QLSPDF0S0SDFKK | 2022-06-15 11:44:21
| 14 | AKNKSL | QLSPDF0S0SDFKK | 2022-06-15 12:51:42
remove | 13 | AKNKSL | QLSPDF0S0SDFKK | 2022-06-15 12:51:42
CodePudding user response:
If you want to remove the duplicate data you can use this
SELECT DISTINCT * FROM data;
but if you want to choose the latest record from both of it you can use this
SELECT * FROM data WHERE [data] = (SELECT MAX([data.created]) FROM data) GO
CodePudding user response:
DELETE * FROM data WHERE id IN (SELECT id FROM data WHERE name="ABNCKN" LIMIT((SELECT COUNT(name) FROM data WHERE name="ABNCKN") - 1)));
SELECT id FROM data WHERE name="ABNCKN" >> this query will get all ids with specific name, and limit will limit the the number of rows returned. The number of rows will be dynamic based on the duplicated rows except one row. and you can add ORDER BY to the query to arrange them.
and finally delete all duplicated rows except one row.
I hope it works well
CodePudding user response:
Didn't understand very well why you keep some duplicated data in the expected output but what you can try is get the last Id with a JOIN and filter it from the WHERE so is not deleted:
DELETE d1 FROM data d1, data d2
JOIN (SELECT MAX(pid) pid FROM data) d3
WHERE d1.pid != d3.pid AND d1.pid < d2.pid AND d1.id = d2.id;
CodePudding user response:
You can achieve this using self join:
DELETE t1 FROM data t1
INNER JOIN data t2
WHERE t1.pid < t2.pid
AND t1.serviceid = t2.serviceid
AND t1.id = t2.id
AND t1.created = t2.created;
SELECT * FROM data;
See this db<>fiddle
For MySQL 8 you can also achieve this using ROW_NUMBER()
:
WITH CTE AS
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY serviceid,id,created ORDER BY pid DESC) AS RN
FROM data
)
DELETE data
FROM data
JOIN CTE ON data.pid=CTE.pid
WHERE CTE.RN>1;
SELECT * FROM data;
Check this db<>fiddle
Output after delete:
pid | serviceid | id | created |
---|---|---|---|
11 | ABNCKN | Q0CHZ2GI1VKOPL | 2022-06-14 10:49:34 |
12 | KSPSLS | QLSPDF0S0SDFKK | 2022-06-15 11:44:21 |
14 | AKNKSL | QLSPDF0S0SDFKK | 2022-06-15 12:51:42 |
As Nishant mentioned pid 10 should also be removed.