Got a table with several columns, with date at the end. It looks like:
Col1 Col2 Col3 date
----- ------ ------ ------------
x y z 2022-10-01
x y z 2022-10-10
a b c 2022-10-01
a b b 2022-10-10
w u c 2022-10-15
What I'm trying to do is remove duplicates based on first three columns. With latest date left in column 4.
Tried to list it with:
Select col1, col2, col3, count(*) as counter
from database
group by col1, col2, col3, date
having count (*) > 1;
It's not working because it counts each row, including different dates.. Haven't found any other clues
Expected output is:
Col1 | Col2 | Col3 | Date |
---|---|---|---|
x | y | z | 2022-10-10 |
a | b | c | 2022-10-10 |
w | u | c | 2022-10-15 |
CodePudding user response:
You can use a common table expression and ROW_NUMBER
to achieve this:
WITH cte
AS
(
SELECT col1, col2, col3, ROW_NUMBER() OVER (PARTITION BY col1, col2, col3 ORDER BY date) as rn
from database
)
DELETE cte
WHERE rn>1;
CodePudding user response:
Just aggregate by the first 3 columns and take the max of the fourth:
SELECT col1, col2, col3, MAX(date) AS date
FROM yourTable
GROUP BY col1, col2, col3;