Home > Software design >  Get only rows when column is duplicated and timestamps get a difference between them
Get only rows when column is duplicated and timestamps get a difference between them

Time:03-08

I have to figured out how get only the rows within duplicated values but differents timestamps. Each timestamp should be between -1/ 1 minutes of diff at maximum.

id client_id amount token sales_ref_id updated_at
1 29238 6.333 a$idjqb5AUdnh34 380 2020-11-17 18:45
2 2008 200 kla3@nans#9aisD 9 2020-11-17 18:45
3 2008 200 4843$a1s00c4544 9 2020-11-17 18:46
4 2008 -200 va$ie32u*Asc2001 9 2020-11-17 18:48
5 1800 6.333 a$id4843$0c45c44 19 2020-11-17 19:44

Like the data above, I should get only the rows #2 and #3, because they have the same client, amount, sales_ref_id and only have 1 minutes plus or less of difference.

But I dont have any clue how should I start to a close solution.

CodePudding user response:

Yo can use EXISTS:

SELECT t1.*
FROM tablename t1
WHERE EXISTS (
  SELECT 1
  FROM tablename t2
  WHERE t2.id <> t1.id
    AND (t2.client_id, t2.amount, t2.sales_ref_id) = (t1.client_id, t1.amount, t1.sales_ref_id)
    AND ABS(strftime('%s', t2.updated_at) - strftime('%s', t1.updated_at)) <= 60
);

See the demo.

CodePudding user response:

Do you want both rows returned, or do you simply need to identify duplicates?

SELECT client_id, amount, sales_ref_id, MIN(updated_at), MAX(updated_at)
FROM YourTable
GROUP BY client_id, amount, sales_ref_id
HAVING COUNT(*) > 1 AND DATEDIFF(MI, MIN(updated_at), MAX(updated_at)) <= 1
  • Related