  cutoff     `KM Median rwToT (95% CI)` `Restricted mean rwToT @ 24 months (95% CI)`
  <chr>      <chr>                      <chr>
1 2017-01-01 2.1 (1.4 - 4.9)            7.2 (3.9 - 10.2) [LogNorm]
2 2017-04-01 3.5 (2.1 - 4.7)            8.9 (6.6 - 10.8) [LogNorm]
3 2017-07-01 3.7 (2.8 - 4.2)            7.2 (6.2 - 8.4) [Weibull]
I have this table. I am trying to split the values in the KM Median rwToT (95% CI) and Restricted mean rwToT @ 24 months (95% CI) columns into separate numeric columns. I know I am supposed to use a regular expression, but I am not sure how to extract the numbers inside the brackets.
Here is the sample data:
structure(list(cutoff = c("2017-01-01", "2017-04-01", "2017-07-01"),
`KM Median rwToT (95% CI)` = c("2.1 (1.4 - 4.9)", "3.5 (2.1 - 4.7)", "3.7 (2.8 - 4.2)"), `Restricted mean rwToT @ 24 months (95% CI)` = c("7.2 (3.9 - 10.2) [LogNorm]",
"8.9 (6.6 - 10.8) [LogNorm]", "7.2 (6.2 - 8.4) [Weibull]")), row.names = c(NA,
-3L), class = c("tbl_df", "tbl", "data.frame"))
CodePudding user response:
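Use separate() from tidyr, splitting each column on runs of characters that are neither alphanumeric nor a dot:
library(dplyr)
library(tidyr)
# DF is the tibble from the question
DF %>%
  # shorten the column names so they are easier to work with
  setNames(c("cutoff", "KM", "Rest")) %>%
  # the trailing ")" or "]" leaves an empty last piece, which the NA in `into` drops
  separate(KM, into = c("KM", "KM_lo", "KM_hi", NA),
           sep = "[^[:alnum:].]+", convert = TRUE) %>%
  separate(Rest, into = c("Rest", "Rest_lo", "Rest_hi", "model", NA),
           sep = "[^[:alnum:].]+", convert = TRUE)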
giving:
# A tibble: 3 x 8
  cutoff        KM KM_lo KM_hi  Rest Rest_lo Rest_hi model
  <chr>      <dbl> <dbl> <dbl> <dbl>   <dbl>   <dbl> <chr>
1 2017-01-01   2.1   1.4   4.9   7.2     3.9    10.2 LogNorm
2 2017-04-01   3.5   2.1   4.7   8.9     6.6    10.8 LogNorm
3 2017-07-01   3.7   2.8   4.2   7.2     6.2     8.4 Weibull
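Since the question asks specifically about a regular expression, here is a minimal base R sketch of the same idea that matches the numbers directly instead of splitting on the separators. It assumes every entry contains exactly three numbers (estimate, lower bound, upper bound) and that the model name is the only text in square brackets; the variable names km, rest, km_mat, rest_mat and model are made up for illustration.

```r
# Sample strings copied from the question's data.
km   <- c("2.1 (1.4 - 4.9)", "3.5 (2.1 - 4.7)", "3.7 (2.8 - 4.2)")
rest <- c("7.2 (3.9 - 10.2) [LogNorm]",
          "8.9 (6.6 - 10.8) [LogNorm]",
          "7.2 (6.2 - 8.4) [Weibull]")

# Pattern for a run of digits with an optional decimal part.
num_pattern <- "[0-9]+(\\.[0-9]+)?"

# Pull every number out of each string and stack them into a numeric matrix.
# Assumes exactly three numbers per string; otherwise rbind() would misalign.
to_matrix <- function(x, prefix) {
  nums <- regmatches(x, gregexpr(num_pattern, x))
  m <- do.call(rbind, lapply(nums, as.numeric))
  colnames(m) <- c(prefix, paste0(prefix, "_lo"), paste0(prefix, "_hi"))
  m
}

km_mat   <- to_matrix(km, "KM")
rest_mat <- to_matrix(rest, "Rest")

# Keep only the text between the square brackets as the model name.
model <- sub(".*\\[(.*)\\]$", "\\1", rest)

# Combine everything into one data frame.
result <- data.frame(km_mat, rest_mat, model = model)
```

Matching the numbers rather than the separators may be easier to adapt if the punctuation between values varies, but unlike separate() it will not warn when a row has more or fewer pieces than expected.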