What I want to do is to set column Deleted
= 1 for all duplicates in table Customers
except of one. So that one entry remains.
Table:
| Deleted | ID | Number | Name |
-----------------------------------
| 0 | 2A3E | 100004 | Andy |
| 0 | 9P3W | 102506 | Judy |
| 0 | 3R4Z | 120097 | Freddy|
| 0 | 1G5Y | 103905 | Nikky |
| 0 | 2A3E | 103905 | Nikky |
| 0 | 9P3W | 147001 | Johnny|
| 0 | 5K7V | 120097 | Teddy |
| 0 | 6D8S | 120097 | Teddy |
Query:
WITH DataSource AS
(
SELECT *,
COUNT(*) OVER (PARTITION BY Number) count_calc
FROM Customers
)
SELECT *
FROM DataSource
WHERE count_calc > 1
Results:
| Deleted | ID | Number | Name |
-----------------------------------
| 0 | 1G5Y | 103905 | Nikky |
| 0 | 2A3E | 103905 | Nikky |
| 0 | 5K7V | 120097 | Teddy |
| 0 | 6D8S | 120097 | Teddy |
Now I want set Deleted
= 1 for only one Nikky and one Teddy.
Tried DISTINCT
and GROUP BY
but didn't work. How can I modify the query so that it returns only one of these duplicates from Nikky and Teddy and not both? But at the same time returns all columns so that I can set Deleted
of one to 1?
CodePudding user response:
You are part-way there, you need to use row_number
over a window, then you can use an updatable CTE:
with d as (
select * , Row_Number() over(partition by number order by id) rn
from customers
)
update d set deleted=1 where rn>1