I would like to label each row in this data frame with its name plus a zero-padded running number within that name:
x <- data.frame(
  name = c('a', 'a', 'b', 'c', 'c', 'c'),
  id = c(2324545, 343245, 35435, 546565, 67432, 87865)
)
Final result should be:
name id new_name
1 a 2324545 a_01
2 a 343245 a_02
3 b 35435 b_01
4 c 546565 c_01
5 c 67432 c_02
6 c 87865 c_03
How is this possible? Thanks
CodePudding user response:
You can do:
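library(dplyr)
library(stringr)  # str_pad() comes from stringr, not dplyr
x %>%
  group_by(name) %>%
  # row_number() counts rows within each group; str_pad() left-pads it to width 2
  mutate(new_name = paste(name, str_pad(row_number(), 2, pad = '0'), sep = '_'))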
# A tibble: 6 × 3
# Groups: name [3]
name id new_name
<chr> <dbl> <chr>
1 a 2324545 a_01
2 a 343245 a_02
3 b 35435 b_01
4 c 546565 c_01
5 c 67432 c_02
6 c 87865 c_03
CodePudding user response:
I would use aggregate() to calculate the "within-name ID number", paste0() to concatenate the strings, and formatC() to add leading zeroes.
x <- data.frame(
  name = c('a', 'a', 'b', 'c', 'c', 'c'),
  id = c(2324545, 343245, 35435, 546565, 67432, 87865)
)
x$new_name <- paste0(
  x$name, "_",
  formatC(
    do.call('c', aggregate(rep(1, length(x$name)),
                           list(factor(x$name)),
                           FUN = cumsum)$x),
    width = 2,
    format = "d",
    flag = "0"
  )
)
x
#> name id new_name
#> 1 a 2324545 a_01
#> 2 a 343245 a_02
#> 3 b 35435 b_01
#> 4 c 546565 c_01
#> 5 c 67432 c_02
#> 6 c 87865 c_03
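One caveat with the aggregate() approach: it returns the counters ordered by group, so it lines up with the rows only when the data frame is already sorted by name, as it is in the question. A possible alternative, offered as a sketch rather than as what the answer above intended, is base R's ave(), which returns the per-group result in the original row order. The shuffled data frame and the counter variable below are made up for illustration.

```r
# Same values as in the question, but deliberately shuffled (illustrative only)
x <- data.frame(
  name = c('c', 'a', 'b', 'c', 'a', 'c'),
  id = c(546565, 2324545, 35435, 67432, 343245, 87865)
)

# ave() applies FUN within each group of x$name and puts the results
# back in the original row positions
counter <- ave(seq_along(x$name), x$name, FUN = seq_along)

# %02d left-pads the integer counter with zeros to width 2
x$new_name <- sprintf('%s_%02d', x$name, counter)
x
```

With this input the new_name column should come out as c_01, a_01, b_01, c_02, a_02, c_03, i.e. numbered in order of appearance within each name.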
CodePudding user response:
With data.table:
library(data.table)
setDT(x)[, new_name := sprintf('%s_%02d', name, rowid(name))]
Output:
> x
name id new_name
<char> <num> <char>
1: a 2324545 a_01
2: a 343245 a_02
3: b 35435 b_01
4: c 546565 c_01
5: c 67432 c_02
6: c 87865 c_03
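For anyone unsure how the zero padding comes about: in a sprintf() format, %s inserts the name and %02d prints an integer left-padded with zeros to a minimum width of 2. A small base R sketch of just that step follows; the vectors name and n are hypothetical stand-ins for the name column and the within-group counter.

```r
# Hypothetical stand-ins for the name column and a within-group counter
name <- c('a', 'a', 'b')
n <- c(1L, 2L, 1L)

# sprintf() is vectorised, so each name is paired with its counter
labels <- sprintf('%s_%02d', name, n)
print(labels)

# The width is a minimum: a counter above 99 is not truncated,
# it simply takes three digits
wide <- sprintf('%s_%02d', 'c', 100L)
print(wide)
```

If some name can occur more than 99 times and the labels need to sort correctly as text, a larger width such as %03d is probably the safer choice.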