How to add sequential values to identified duplicates before first character?

Time:03-12

I would like to identify duplicates and then add a sequential number before the first character of each duplicated value. The script below identifies the duplicates, but it appends the number as a suffix instead of prepending it.

I have a dataset that looks like this

col|
X123
X123
X456
X789
X890
X142
X142
X142


df$col<- ifelse(duplicated(df[,c("col")])|duplicated(df[,c("col")],fromLast = TRUE),
                      make.unique(df$col),df$col)

What my script ends up doing is this

col|
X123
X123.1
X456
X789
X890
X142
X142.1
X142.2

What I would like for it to do is

col|
1X123
2X123
X456
X789
X890
1X142
2X142
3X142

CodePudding user response:

1) Define a function which prepends sequence numbers and then use it with ave.

add_seq <- function(x) if (length(x) == 1) x else paste0(seq_along(x), x)
transform(DF, col = ave(col, col, FUN = add_seq))

giving:

    col
1 1X123
2 2X123
3  X456
4  X789
5  X890
6 1X142
7 2X142
8 3X142
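
To see why this works, it may help to look at what ave passes to add_seq: one vector per distinct value of col, with the results written back to the original row positions. The following is only a sketch under that reading; x is a hypothetical stand-in for DF$col, and the split step is there purely for illustration.

```r
# Hypothetical stand-in for DF$col (same sample values as the question)
x <- c("X123", "X123", "X456", "X789", "X890", "X142", "X142", "X142")

add_seq <- function(x) if (length(x) == 1) x else paste0(seq_along(x), x)

# split() exposes the groups that ave() processes one at a time
groups <- split(x, x)

# Applying add_seq by hand to one duplicated group and one unique group
dup_group    <- add_seq(groups[["X142"]])  # gets prefixes 1, 2, 3
single_group <- add_seq(groups[["X456"]])  # length 1, so left unchanged

# ave() does this for every group and restores the original order
result <- ave(x, x, FUN = add_seq)
print(result)
```

Because the length check happens inside each group, values that occur only once never receive a prefix.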

2) A variation which uses the idea of incorporating duplicated, as in the question, is the following. It gives the same result.

transform(DF, col = (duplicated(col) | duplicated(col, fromLast = TRUE)) |>
                      ifelse(ave(col, col, FUN = seq_along), "") |>
                      paste0(col))
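
The native pipe |> used above requires R 4.1.0 or later. As a rough sketch of the same logic without the pipe, the steps can be written out one at a time; the intermediate names (is_dup, idx, prefix) are made up here for illustration and are not part of the answer.

```r
# Hypothetical stand-in for DF$col
col <- c("X123", "X123", "X456", "X789", "X890", "X142", "X142", "X142")

# TRUE for every member of a duplicated group, including the first occurrence
is_dup <- duplicated(col) | duplicated(col, fromLast = TRUE)

# Running index within each group; ave() returns it as character
# because col itself is character
idx <- ave(col, col, FUN = seq_along)

# Keep the index only for duplicated values, otherwise use an empty prefix
prefix <- ifelse(is_dup, idx, "")

result <- paste0(prefix, col)
print(result)
```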

Note: the input, in reproducible form, is:

Lines <- "col
X123
X123
X456
X789
X890
X142
X142
X142"
DF <- read.table(text = Lines, header = TRUE, strip.white = TRUE)

CodePudding user response:

This uses data.table. We first add two columns by reference: id, which holds the row number within each group, and N, which holds the total number of rows in that group. We then use a vectorized if-else (data.table::fifelse) to paste id in front of col when the group has more than one row. The code does this row by row (by=1:nrow(df)), although fifelse and paste0 are already vectorized, so the per-row grouping is not strictly needed. The final line drops the temporary id and N columns.

library(data.table)
library(magrittr)  # provides the %>% pipe used below; data.table does not export it

setDT(df)[, `:=`(id=1:.N, N=.N), by=col] %>% 
  .[,col:=fifelse(N>1,paste0(id,col),col), by=1:nrow(df)] %>% 
  .[,`:=`(id=NULL, N=NULL)]

     col
   <char>
1:  1X123
2:  2X123
3:   X456
4:   X789
5:   X890
6:  1X142
7:  2X142
8:  3X142
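
A possible variation on the same idea, sketched here as an assumption rather than as the answer's own method, avoids both the %>% pipe and the per-row grouping by using data.table's rowid() and a filtered update. The names dt, id and n are hypothetical.

```r
library(data.table)

# Hypothetical copy of the sample data
dt <- data.table(col = c("X123", "X123", "X456", "X789",
                         "X890", "X142", "X142", "X142"))

dt[, n := .N, by = col]       # group size for each value of col
dt[, id := rowid(col)]        # running counter within each value of col

# Prepend the counter only where the value occurs more than once
dt[n > 1L, col := paste0(id, col)]

# Drop the helper columns
dt[, c("id", "n") := NULL]
print(dt)
```

The helper columns are computed before col is modified, so the update in the third step does not affect the grouping.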