How to create a unique ID for each subject?

Here is the data:

Subject code	Name
401	John
422	Mary
463	Peter

I would like to create a unique ID based on the last two digits of the subject code. For example:

ID	Subject code	Name
S01	401	John
S22	422	Mary
S63	463	Peter

Which library should I use? Should I use case_when() in this situation?

CodePudding user response：

You can use str_extract and str_c from the stringr package:

library(tidyverse)
df %>%
  mutate(ID = str_c("S", str_extract(Subject_code, "\\d{2}$")))
  Subject_code  ID
1          401 S01
2          422 S22
3          463 S63

The regex pattern \\d{2}$ matches the two digits that occur in string-final ($) position and extracts them.
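
If you want to check the pattern without loading stringr, here is a minimal base R sketch of the same idea. It is not the stringr call above but an equivalent using regexpr and regmatches; the vector name codes and its values are just sample data for illustration.

```r
# Sample subject codes (illustrative values taken from the question)
codes <- c(401, 422, 463)

# Regex functions work on character vectors, so convert explicitly
chr <- as.character(codes)

# regexpr() finds the position of "two digits at the end of the string";
# regmatches() then pulls out the matched text.
# Note: regmatches() silently drops elements with no match.
last_two <- regmatches(chr, regexpr("\\d{2}$", chr))

# Prefix each two-digit suffix with "S"
ids <- paste0("S", last_two)
print(ids)
```

Because regmatches drops non-matching elements, this sketch assumes every code has at least two digits; str_extract returns NA in that case instead, which is safer inside mutate.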

Data:

df <- data.frame(
  Subject_code = c(401, 422, 463))

CodePudding user response：

You can use substr and paste0:

data$ID <- paste0("S", substr(data$`Subject code`, 2, 3))

e.g.:

paste0("S", substr(431, 2, 3))
#[1] "S31"

or in dplyr:

library(dplyr)
data %>%
  mutate(ID = paste0("S", substr(`Subject code`, 2, 3)))
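
Note that substr(x, 2, 3) relies on every code being exactly three digits long. If that might not hold, a small variation (a sketch only; the four-digit code 1205 is a made-up value to show the difference) is to count back from the end of the string with nchar:

```r
# Sample codes; 1205 is a hypothetical longer code added for illustration
codes <- c(401, 422, 463, 1205)

# substr() works on character data, so convert explicitly
chr <- as.character(codes)

# Number of characters in each code
n <- nchar(chr)

# Take the last two characters regardless of the code's length
ids <- paste0("S", substr(chr, n - 1, n))
print(ids)
```

For the three-digit codes this gives the same result as substr(x, 2, 3); for 1205 it picks "05" rather than "20".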

CodePudding user response：

We can try sub, as shown below:

> transform(df, ID = sub(".", "S", SubjectCode))[c(3, 1, 2)]
   ID SubjectCode  Name
1 S01         401  John
2 S22         422  Mary
3 S63         463 Peter
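
To see what sub is doing here: the pattern "." matches any single character, and sub replaces only the first match, so the leading digit is swapped for the prefix. A minimal sketch on a plain vector, using an uppercase "S" to match the IDs requested in the question (the value 1205 is a hypothetical longer code, not part of the original data):

```r
# Sample subject codes from the question
codes <- c(401, 422, 463)

# sub() coerces the numbers to character and replaces only the first match;
# "." matches the first character, so the leading digit becomes "S"
ids <- sub(".", "S", codes)
print(ids)

# Limitation: with a longer (hypothetical) code only the first character
# is replaced, so more than two digits remain after the prefix
longer <- sub(".", "S", 1205)
print(longer)
```

So this approach, like substr(x, 2, 3), is only equivalent to "last two digits" when all codes have exactly three digits.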