R: extract multiple numbers from strings inside brackets with a regular expression

Time:09-29

  cutoff     `KM Median rwToT (95% CI)` `Restricted mean rwToT @ 24 months (95% CI)`
  <chr>      <chr>                      <chr>
1 2017-01-01 2.1 (1.4 - 4.9)            7.2 (3.9 - 10.2) [LogNorm]
2 2017-04-01 3.5 (2.1 - 4.7)            8.9 (6.6 - 10.8) [LogNorm]
3 2017-07-01 3.7 (2.8 - 4.2)            7.2 (6.2 - 8.4) [Weibull]

I have this table. I am trying to extract and separate the numbers in the `KM Median rwToT (95% CI)` and `Restricted mean rwToT @ 24 months (95% CI)` columns. I know I should use a regular expression, but I am not sure how to extract the numbers inside the brackets.

Here is the sample data:

DF <- structure(list(
  cutoff = c("2017-01-01", "2017-04-01", "2017-07-01"),
  `KM Median rwToT (95% CI)` = c("2.1 (1.4 - 4.9)", "3.5 (2.1 - 4.7)", "3.7 (2.8 - 4.2)"),
  `Restricted mean rwToT @ 24 months (95% CI)` = c("7.2 (3.9 - 10.2) [LogNorm]",
    "8.9 (6.6 - 10.8) [LogNorm]", "7.2 (6.2 - 8.4) [Weibull]")),
  row.names = c(NA, -3L), class = c("tbl_df", "tbl", "data.frame"))

User response:

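Use separate from tidyr, splitting each column on any run of characters that is neither alphanumeric nor a period:

library(dplyr)
library(tidyr)

# DF is the sample data from the question
DF %>%
  setNames(c("cutoff", "KM", "Rest")) %>%
  # split on runs of characters that are neither alphanumeric nor "."
  # the trailing NA in `into` drops the empty piece left after the closing ")"
  separate(KM, into = c("KM", "KM_lo", "KM_hi", NA),
    sep = "[^[:alnum:].]+", convert = TRUE) %>%
  # the fourth piece is the model name from the square brackets
  separate(Rest, into = c("Rest", "Rest_lo", "Rest_hi", "model", NA),
    sep = "[^[:alnum:].]+", convert = TRUE)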

giving:

# A tibble: 3 x 8
  cutoff        KM KM_lo KM_hi  Rest Rest_lo Rest_hi model  
  <chr>      <dbl> <dbl> <dbl> <dbl>   <dbl>   <dbl> <chr>  
1 2017-01-01   2.1   1.4   4.9   7.2     3.9    10.2 LogNorm
2 2017-04-01   3.5   2.1   4.7   8.9     6.6    10.8 LogNorm
3 2017-07-01   3.7   2.8   4.2   7.2     6.2     8.4 Weibull
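
If you would rather match the numbers directly than split on the separators, a base R sketch along the following lines may also work. It is not part of the answer above: extract_numbers is a hypothetical helper name, and the code assumes every string in a column contains the same count of numbers (three here) and that the model name is the only text in square brackets.

```r
# Sample strings copied from the two columns in the question
km   <- c("2.1 (1.4 - 4.9)", "3.5 (2.1 - 4.7)", "3.7 (2.8 - 4.2)")
rest <- c("7.2 (3.9 - 10.2) [LogNorm]", "8.9 (6.6 - 10.8) [LogNorm]",
          "7.2 (6.2 - 8.4) [Weibull]")

# Hypothetical helper: pull every (optionally decimal) number out of each string
extract_numbers <- function(x) {
  hits <- regmatches(x, gregexpr("[0-9]+(\\.[0-9]+)?", x))
  # One matrix row per input string; assumes each string has the same
  # number of matches, otherwise rbind would recycle values
  do.call(rbind, lapply(hits, as.numeric))
}

km_num <- extract_numbers(km)
colnames(km_num) <- c("KM", "KM_lo", "KM_hi")

rest_num <- extract_numbers(rest)
colnames(rest_num) <- c("Rest", "Rest_lo", "Rest_hi")

# The model name is whatever sits between the square brackets
model <- sub(".*\\[([^]]+)\\].*", "\\1", rest)

result <- data.frame(km_num, rest_num, model = model)
```

Here result should hold the same seven value columns as the tibble above, without the cutoff column. Matching the numbers rather than splitting means no empty trailing piece has to be discarded, at the cost of handling the model name in a separate step.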