For example, one column of my table looks like this:
HGVS.Consequence
Lys10Arg
Lys10Lys
LeullLeu
Phe12Ser
Phe12Cys
lle13Leu
lle13Val
lle13Phe
Thr15Pro
And I want a table like this:
Mutation  Ref  Change  Position
lle13Val  lle  Val     13
lle13Phe  lle  Phe     13
Thr15Pro  Thr  Pro     15
CodePudding user response:
tidyr::extract(df, HGVS.Consequence,
               c('Ref', 'Position', 'Change'), '(\\D+)(\\d+)(\\D+)', remove = FALSE)
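For anyone without tidyr, the same three-capture-group idea can be sketched in base R with utils::strcapture. This is only an illustration, not part of the answer above: the vector x is copied from the question, and the names parts and result are made up here. Values that do not match the pattern (such as LeullLeu, which contains no digits) come back as NA.

```r
# Sample values copied from the question.
x <- c("Lys10Arg", "Lys10Lys", "LeullLeu", "Phe12Ser", "Phe12Cys",
       "lle13Leu", "lle13Val", "lle13Phe", "Thr15Pro")

# Three capture groups: non-digits, digits, non-digits.
# proto supplies the column names and types of the result.
parts <- strcapture(
  pattern = "^(\\D+)(\\d+)(\\D+)$",
  x = x,
  proto = data.frame(Ref = character(),
                     Position = integer(),
                     Change = character(),
                     stringsAsFactors = FALSE)
)

# Put the original string first and reorder to Ref, Change, Position.
result <- cbind(Mutation = x, parts[c("Ref", "Change", "Position")])
result
```

Unlike fixed character positions, this pattern does not assume a two-digit position, so a value such as Lys100Arg would also be split correctly.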
CodePudding user response:
Using tidyr::separate and tidying the ordering / names with dplyr:
tidyr::separate(data = df,
                col = HGVS.Consequence,
                into = c("Ref", "Position", "Change"),
                sep = c(3, 5, 8),
                remove = FALSE) |>
  dplyr::select(1, 2, 4, 3) |>
  dplyr::rename(mutation = HGVS.Consequence)
#>   mutation Ref Change Position
#> 1 Lys10Arg Lys    Arg       10
#> 2 Lys10Lys Lys    Lys       10
#> 3 LeullLeu Leu    Leu       ll
#> 4 Phe12Ser Phe    Ser       12
#> 5 Phe12Cys Phe    Cys       12
#> 6 lle13Leu lle    Leu       13
#> 7 lle13Val lle    Val       13
#> 8 lle13Phe lle    Phe       13
#> 9 Thr15Pro Thr    Pro       15
CodePudding user response:
Code
Here is a base R way with substr.
sepfun <- function(x){
  s1 <- substr(x, 1, 3)   # reference residue (characters 1-3)
  s2 <- substr(x, 4, 5)   # position (characters 4-5)
  s3 <- substring(x, 6)   # changed residue (character 6 onward)
  y <- do.call(cbind.data.frame, list(s1, s3, s2))
  names(y) <- c("Ref", "Change", "Position")
  cbind(Mutation = x, y)
}
sepfun(df1$HGVS.Consequence)
#>   Mutation Ref Change Position
#> 1 Lys10Arg Lys    Arg       10
#> 2 Lys10Lys Lys    Lys       10
#> 3 LeullLeu Leu    Leu       ll
#> 4 Phe12Ser Phe    Ser       12
#> 5 Phe12Cys Phe    Cys       12
#> 6 lle13Leu lle    Leu       13
#> 7 lle13Val lle    Val       13
#> 8 lle13Phe lle    Phe       13
#> 9 Thr15Pro Thr    Pro       15
Created on 2022-02-13 by the reprex package (v2.0.1)
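Because the split is by fixed character positions, the Position column stays character and the LeullLeu row keeps "ll" as its position. A minimal sketch of how such rows could be flagged and the rest converted to integers is below. The vector position is typed in from the output above, and the names is_numeric_pos and position_int are hypothetical.

```r
# Position values as they appear in the output above.
position <- c("10", "10", "ll", "12", "12", "13", "13", "13", "15")

# TRUE where the position consists only of digits.
is_numeric_pos <- grepl("^[0-9]+$", position)

# Convert the valid positions to integer; leave malformed ones as NA.
position_int <- rep(NA_integer_, length(position))
position_int[is_numeric_pos] <- as.integer(position[is_numeric_pos])

position_int
which(!is_numeric_pos)  # row indices that need a manual look
```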
Data
HGVS.Consequence <- scan(text = '
Lys10Arg
Lys10Lys
LeullLeu
Phe12Ser
Phe12Cys
lle13Leu
lle13Val
lle13Phe
Thr15Pro
', sep = "\n", what = character())
df1 <- data.frame(HGVS.Consequence)