Here is the data:
Subject code | Name |
---|---|
401 | John |
422 | Mary |
463 | Peter |
I would like to create a unique ID based on the last two digits of the subject code. For example:
ID | Subject code | Name |
---|---|---|
S01 | 401 | John |
S22 | 422 | Mary |
S63 | 463 | Peter |
Which library should I use? Should I use case_when() in this situation?
CodePudding user response:
You can use str_extract and str_c from the stringr package:
library(tidyverse)
df %>%
mutate(ID = str_c("S", str_extract(Subject_code, "\\d{2}$")))
Subject_code ID
1 401 S01
2 422 S22
3 463 S63
The regex pattern \\d{2}$ matches the two digits that occur in string-final ($) position, and str_extract extracts them.
Data:
df <- data.frame(
Subject_code = c(401, 422, 463))
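To see what the pattern does on its own, here is a minimal sketch run on a plain character vector. The vector codes and the four-digit value "1463" are made up for illustration and are not part of the question's data; the point is that the $ anchor always picks the final two digits, whatever the length of the code.

```r
library(stringr)

# Hypothetical example codes, stored as character strings
codes <- c("401", "422", "463", "1463")

# "\\d{2}$" matches exactly two digits anchored at the end of the string
last_two <- str_extract(codes, "\\d{2}$")

# Prepend the fixed "S" prefix
ids <- str_c("S", last_two)
ids
#[1] "S01" "S22" "S63" "S63"
```

Note that codes sharing the same last two digits (463 and 1463 here) produce the same ID, so the result is only unique if the last two digits are.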
CodePudding user response:
You can use substr and paste0:
data$ID <- paste0("S", substr(data$`Subject code`, 2, 3))
e.g.:
paste0("S", substr(431, 2, 3))
#[1] "S31"
or in dplyr:
library(dplyr)
data %>%
mutate(ID = paste0("S", substr(`Subject code`, 2, 3)))
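The fixed positions 2 and 3 assume every subject code has exactly three characters. If that cannot be guaranteed, a possible variant is to compute the positions from the string length with nchar. This is only a sketch; the vector codes and the four-digit value 1463 are hypothetical.

```r
# Hypothetical codes, including one that is not three digits long
codes <- c(401, 422, 463, 1463)

# Convert to character once so nchar() and substr() see the same strings
code_chr <- as.character(codes)
n <- nchar(code_chr)

# Take the last two characters of each code, whatever its length
ids <- paste0("S", substr(code_chr, n - 1, n))
ids
#[1] "S01" "S22" "S63" "S63"
```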
CodePudding user response:
We can try sub like below:
transform(df, ID = sub(".", "S", SubjectCode))[c(3, 1, 2)]
ID SubjectCode Name
1 S01 401 John
2 S22 422 Mary
3 S63 463 Peter
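Replacing the first character gives the last two digits only when every code is exactly three characters long. A possible variant, sketched here under that caveat, keeps the sub approach but captures the final two digits explicitly. The helper name make_id and the five-digit value 12345 are hypothetical, not part of the original answer.

```r
# Hypothetical helper: keep only the last two digits and prefix them with "S"
make_id <- function(code) {
  # ".*" consumes everything before the final two digits,
  # "(\\d{2})$" captures those digits, and "\\1" re-inserts them
  sub(".*(\\d{2})$", "S\\1", as.character(code))
}

make_id(c(401, 422, 463, 12345))
#[1] "S01" "S22" "S63" "S45"
```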