deleting specific duplicate and original entries in a table based on date-CodePudding

i have a table called "main" which has 4 columns, ID, name, DateID and Sign.

i want to create a query that will delete entries in this table if there is the same ID record in twice within a certain DateID.

i have my where clause that searches the previous 3 weeks

 where DateID =((SELECT MAX( DateID)
                                WHERE DateID < ( SELECT MAX( DateID )-3))

e.g of my dataset im working with:

id	name	DateID	sign
12345	Paul	1915	Up
23658	Danny	1915	Down
37868	Jake	1916	Up
37542	Elle	1917	Up
12345	Paul	1917	Down
87456	John	1918	Up
78563	Luke	1919	Up
23658	Danny	1920	Up

in the case above, both entries for ID 12345 would need to be removed. however the entries for ID 23658 would need to be kept as the DateID > 3

how would this be possible?

CodePudding user response：

You can use window functions for this.

It's not quite clear, but it seems LAG and conditional COUNT should fit what you need.

DELETE t
FROM (
    SELECT *,
      CountWithinDate = COUNT(CASE WHEN t.PrevDate >= t.DateId - 3 THEN 1 END) OVER (PARTITION BY t.id)
    FROM (
        SELECT *,
          PrevDate = LAG(t.DateID) OVER (PARTITION BY t.id ORDER BY t.DateID)
        FROM YourTable t
    ) t
) t
WHERE CountWithinDate > 0;

db<>fiddle

Note that you do not need to re-join the table, you can delete directly from the t derived table.

CodePudding user response：

Hope this works:

DELETE FROM test_tbl
WHERE id IN (
                SELECT T1.id
                FROM test_tbl T1
                WHERE EXISTS (SELECT 1 FROM test_tbl  T2 WHERE T1.id = T2.id AND ABS(T2.dateid - T1.dateid) < 3 AND T1.dateid <> T2.dateid)
            )

In case you need more logic for data processing, I would suggest using Stored Procedure.