Home > Back-end >  Select or update only one duplicate from multiple rows
Select or update only one duplicate from multiple rows

Time:10-15

What I want to do is to set column Deleted = 1 for all duplicates in table Customers except of one. So that one entry remains.

Table:

| Deleted | ID   | Number | Name  |
-----------------------------------
| 0       | 2A3E | 100004 | Andy  | 
| 0       | 9P3W | 102506 | Judy  |
| 0       | 3R4Z | 120097 | Freddy| 
| 0       | 1G5Y | 103905 | Nikky |
| 0       | 2A3E | 103905 | Nikky | 
| 0       | 9P3W | 147001 | Johnny|
| 0       | 5K7V | 120097 | Teddy | 
| 0       | 6D8S | 120097 | Teddy |

Query:

WITH DataSource AS
(
    SELECT *, 
    COUNT(*) OVER (PARTITION BY Number) count_calc
    FROM Customers
)
SELECT *
FROM DataSource
WHERE count_calc > 1

Results:

| Deleted | ID   | Number | Name  |
-----------------------------------
| 0       | 1G5Y | 103905 | Nikky |
| 0       | 2A3E | 103905 | Nikky | 
| 0       | 5K7V | 120097 | Teddy | 
| 0       | 6D8S | 120097 | Teddy |

Now I want set Deleted = 1 for only one Nikky and one Teddy.

Tried DISTINCT and GROUP BY but didn't work. How can I modify the query so that it returns only one of these duplicates from Nikky and Teddy and not both? But at the same time returns all columns so that I can set Deleted of one to 1?

CodePudding user response:

You are part-way there, you need to use row_number over a window, then you can use an updatable CTE:

with d as (
    select * , Row_Number() over(partition by number order by id) rn
    from customers
)
update d set deleted=1 where rn>1
  • Related