Home > Software engineering >  SQL query to remove duplicate records without primary key, keeping the most recent [duplicate]
SQL query to remove duplicate records without primary key, keeping the most recent [duplicate]

Time:09-23

I have a transaction table (SQL Express 2014) that hold sales transactions. I need to remove duplicate records, retaining only the most recent record

Example of current data

ACC_PART_MAT      TX_DATE     
A1025-A552        2021-09-02
A1025-B1994       2121-04-28
A1025-B1994       2121-09-02
A1025-B1994       2121-03-21
A1025-B1960       2121-05-20

End result required

ACC_PART_MAT      TX_DATE     
A1025-A552        2021-09-02
A1025-B1994       2121-09-02
A1025-B1960       2121-05-20

There are many examples addressing duplicate records but I cannot get them work with no primary key and dates. Many thanks in advance

CodePudding user response:

For your example, you can just use aggregation:

select ACC_PART_MAT, min(TX_DATE) as TX_DATE
from t
group by ACC_PART_MAT;

If you actually wanted to delete rows from a table, you can use an updatable CTE -- but be careful because this changes the table:

with todelete as (
      select t.*,
             row_number() over (partition by ACC_PART_MAT order by TX_DATE asc) as seqnum
      from t
     ) 
delete from todelete
    where seqnum > 1;
  • Related