I have to figured out how get only the rows within duplicated values but differents timestamps. Each timestamp should be between -1/ 1 minutes of diff at maximum.
id | client_id | amount | token | sales_ref_id | updated_at |
---|---|---|---|---|---|
1 | 29238 | 6.333 | a$idjqb5AUdnh34 | 380 | 2020-11-17 18:45 |
2 | 2008 | 200 | kla3@nans#9aisD | 9 | 2020-11-17 18:45 |
3 | 2008 | 200 | 4843$a1s00c4544 | 9 | 2020-11-17 18:46 |
4 | 2008 | -200 | va$ie32u*Asc2001 | 9 | 2020-11-17 18:48 |
5 | 1800 | 6.333 | a$id4843$0c45c44 | 19 | 2020-11-17 19:44 |
Like the data above, I should get only the rows #2 and #3, because they have the same client, amount, sales_ref_id and only have 1 minutes plus or less of difference.
But I dont have any clue how should I start to a close solution.
CodePudding user response:
Yo can use EXISTS
:
SELECT t1.*
FROM tablename t1
WHERE EXISTS (
SELECT 1
FROM tablename t2
WHERE t2.id <> t1.id
AND (t2.client_id, t2.amount, t2.sales_ref_id) = (t1.client_id, t1.amount, t1.sales_ref_id)
AND ABS(strftime('%s', t2.updated_at) - strftime('%s', t1.updated_at)) <= 60
);
See the demo.
CodePudding user response:
Do you want both rows returned, or do you simply need to identify duplicates?
SELECT client_id, amount, sales_ref_id, MIN(updated_at), MAX(updated_at)
FROM YourTable
GROUP BY client_id, amount, sales_ref_id
HAVING COUNT(*) > 1 AND DATEDIFF(MI, MIN(updated_at), MAX(updated_at)) <= 1