I have a transaction table (SQL Express 2014) that hold sales transactions. I need to remove duplicate records, retaining only the most recent record
Example of current data
ACC_PART_MAT TX_DATE
A1025-A552 2021-09-02
A1025-B1994 2121-04-28
A1025-B1994 2121-09-02
A1025-B1994 2121-03-21
A1025-B1960 2121-05-20
End result required
ACC_PART_MAT TX_DATE
A1025-A552 2021-09-02
A1025-B1994 2121-09-02
A1025-B1960 2121-05-20
There are many examples addressing duplicate records but I cannot get them work with no primary key and dates. Many thanks in advance
CodePudding user response:
For your example, you can just use aggregation:
select ACC_PART_MAT, min(TX_DATE) as TX_DATE
from t
group by ACC_PART_MAT;
If you actually wanted to delete rows from a table, you can use an updatable CTE -- but be careful because this changes the table:
with todelete as (
select t.*,
row_number() over (partition by ACC_PART_MAT order by TX_DATE asc) as seqnum
from t
)
delete from todelete
where seqnum > 1;