Remove partially duplicated rows in SQL Server Express-CodePudding

Got a table with several columns, with date at the end. It looks like:

Col1   Col2   Col3   date
----- ------ ------ ------------
  x     y      z     2022-10-01
  x     y      z     2022-10-10
  a     b      c     2022-10-01
  a     b      b     2022-10-10
  w     u      c     2022-10-15

What I'm trying to do is remove duplicates based on first three columns. With latest date left in column 4.

Tried to list it with:

Select col1, col2, col3, count(*) as counter 
from database
group by col1, col2, col3, date 
having count (*) > 1;

It's not working because it counts each row, including different dates.. Haven't found any other clues

Expected output is:

Col1	Col2	Col3	Date
x	y	z	2022-10-10
a	b	c	2022-10-10
w	u	c	2022-10-15

CodePudding user response：

You can use a common table expression and ROW_NUMBER to achieve this:

WITH cte 
AS
(
SELECT col1, col2, col3, ROW_NUMBER() OVER (PARTITION BY col1, col2, col3 ORDER BY date) as rn
from database
)
DELETE cte
WHERE rn>1;

CodePudding user response：

Just aggregate by the first 3 columns and take the max of the fourth:

SELECT col1, col2, col3, MAX(date) AS date
FROM yourTable
GROUP BY col1, col2, col3;