Home > database >  How to use one of the variables to distinguish duplicated rows
How to use one of the variables to distinguish duplicated rows

Time:11-01

For example: we have 3*2 duplicated rows as follows:

name   identity   gender
Mary   student    female 
Mary   student    female
Mary   student    female
Jack   teacher    male
Jack   teacher    male
Jack   teacher    male

I wanna make those 3 rows as follows:

name    identity   gender
Mary1   student    female 
Mary2   student    female
Mary3   student    female
Jack1   teacher    male
Jack2   teacher    male
Jack3   teacher    male

How could I do it? Thanks

I try to use create function tvValues, but it didn't work.

CodePudding user response:

One approach uses row_number():

select t.*,
    case when count(*) over(partition by name, identity, gender) > 1
        then concat(name, row_number() over(partition by name, identity, gender order by name))
        else name
    end as new_name
from mytable t

This appends a sequential number to names that belong to a duplicate row.

We can turn the statement to an update with an updatable CTE :

with cte as (
    select name,
        count(*) over(partition by name, identity, gender) cnt,
        row_number() over(partition by name, identity, gender order by name) rn
    from mytable
)
update cte
set name = concat(name, rn)
where cnt > 1

CodePudding user response:

This can be accomplished with ROW_NUMBER() OVER(ORDER BY NAME, IDENTITY, GENDER)

You can change your order by if those need tweaked with ASC/DESC.

So you’re SELECT could start:

SELECT CONCAT(NAME, ROW_NUMBER() OVER(ORDER BY NAME, IDENTITY, GENDER)), IDENTITY, GENDER FROM …
  • Related