Home > database >  Get the latest entry of each duplicate on Postgres
Get the latest entry of each duplicate on Postgres

Time:06-15

I am using Postgres v12, and I have a table with duplicated rows. I need to retrieve only the last entry for each duplicate, ignoring entries which have no duplicate.

This table has the following columns:

  • id (unique)
  • request_id (where to find the duplicates)
  • created_at (where to see which entry is the latest)
id request_id created_at
1 a 2020.06.06
2 a 2020.05.05
3 b 2020.04.04
4 b 2020.03.03
5 c 2020.04.04
6 c 2020.03.03
7 d 2020.03.03

The query should retrieve rows with id 1,3,5 , since they are the latest entry (created_at) of each duplicate. ID 7 has no duplicate, so it is ignored.

I have tried with the solution proposed here: https://www.geeksengine.com/article/get-single-record-from-duplicates.html but due to be using Postgres v12, those queries do not work, I get the error "column must appear in the group by clause" which is another problem cited here: must appear in the GROUP BY clause or be used in an aggregate function

I have been searching for a solution for days to this problem, but I am not an SQL expert. I would appreciate any help very much.

CodePudding user response:

here is one way using window functions :

select * from (
    select * 
        , row_number() over (partition by request_id order by created_at desc) as rn
        , count() over (partition by request_id) cn 
    from tablename
) t where cn > 1 and rn = 1
  • Related