Home > Software design >  Separate columns that contain an unequal mixture of numbers and text in R
Separate columns that contain an unequal mixture of numbers and text in R

Time:03-02

I have a very odd data frame that looks like this

library(tidyverse)

df <- tibble(position=c(100,200,300),
             correction=c("62M89S", 
                     "8M1D55M88S",
                     "1S25M1P36M89S"))

df
#> # A tibble: 3 × 2
#>   position correction   
#>      <dbl> <chr>        
#> 1      100 62M89S       
#> 2      200 8M1D55M88S   
#> 3      300 1S25M1P36M89S

Created on 2022-03-02 by the reprex package (v2.0.1) the correction column will be used to correct the positions. I want to separate the correction column in a way that my data frame looks like this

df    col1  col2 col3 col4 col5
100                    62M  89S
200          8M  1D    55M  88S
300     1S  25M  1P    36M  89S

CodePudding user response:

Use tidyr::separate with fill = "left":

library(tidyr)
library(dplyr)
df %>% 
  separate(correction, into = str_c("col", 1:5), 
           sep = "(?<=\\D)(?=\\d)", fill = "left")

# A tibble: 3 x 6
  position col1  col2  col3  col4  col5 
     <dbl> <chr> <chr> <chr> <chr> <chr>
1      100 NA    NA    NA    62M   89S  
2      200 NA    8M    1D    55M   88S  
3      300 1S    25M   1P    36M   89S  
  • Related