Home > Software engineering >  deleting specific duplicate and original entries in a table based on date
deleting specific duplicate and original entries in a table based on date

Time:04-29

i have a table called "main" which has 4 columns, ID, name, DateID and Sign.

i want to create a query that will delete entries in this table if there is the same ID record in twice within a certain DateID.

i have my where clause that searches the previous 3 weeks

 where DateID =((SELECT MAX( DateID)
                                WHERE DateID < ( SELECT MAX( DateID )-3))

e.g of my dataset im working with:

id name DateID sign
12345 Paul 1915 Up
23658 Danny 1915 Down
37868 Jake 1916 Up
37542 Elle 1917 Up
12345 Paul 1917 Down
87456 John 1918 Up
78563 Luke 1919 Up
23658 Danny 1920 Up

in the case above, both entries for ID 12345 would need to be removed. however the entries for ID 23658 would need to be kept as the DateID > 3

how would this be possible?

CodePudding user response:

You can use window functions for this.

It's not quite clear, but it seems LAG and conditional COUNT should fit what you need.

DELETE t
FROM (
    SELECT *,
      CountWithinDate = COUNT(CASE WHEN t.PrevDate >= t.DateId - 3 THEN 1 END) OVER (PARTITION BY t.id)
    FROM (
        SELECT *,
          PrevDate = LAG(t.DateID) OVER (PARTITION BY t.id ORDER BY t.DateID)
        FROM YourTable t
    ) t
) t
WHERE CountWithinDate > 0;

db<>fiddle

Note that you do not need to re-join the table, you can delete directly from the t derived table.

CodePudding user response:

Hope this works:

DELETE FROM test_tbl
WHERE id IN (
                SELECT T1.id
                FROM test_tbl T1
                WHERE EXISTS (SELECT 1 FROM test_tbl  T2 WHERE T1.id = T2.id AND ABS(T2.dateid - T1.dateid) < 3 AND T1.dateid <> T2.dateid)
            ) 

In case you need more logic for data processing, I would suggest using Stored Procedure.

  • Related