I'm analysing survival data, and I'd like to take a column of survival durations, compute a logical vector for each threshold from 1 to 24, and collect the results in a data frame.
For example, take this sample data:
library(dplyr)  # provides tibble(), mutate(), and %>%
set.seed(1988)
test <- tibble(
  survival = sample(1:40, 10, replace = TRUE)
)
I would like to quickly create new columns named "survival1", "survival2", and so on up to "survival24", each holding logical values indicating whether survival > threshold.
As I'm most familiar with dplyr, I've thus far been mutating manually, e.g.
test %>% mutate(survival1 = survival > 1, survival2 = survival > 2)
But I thought there must be a better way!
CodePudding user response:
I can't figure out how to create and name the columns in a single step, but here is as far as I got:
library(tidyverse)
set.seed(1988)
test <- tibble(
  survival = sample(1:40, 10, replace = TRUE)
)
test %>%
  # map_dfc() auto-names the unnamed columns "...1", "...2", etc. (with messages)
  mutate(suppressMessages(map_dfc(1:24, ~ test$survival > .x))) %>%
  rename_with(~ paste0("survival", 1:24), starts_with("..."))
#> # A tibble: 10 × 25
#> survival survival1 survival2 survival3 survival4 survival5 survival6
#> <int> <lgl> <lgl> <lgl> <lgl> <lgl> <lgl>
#> 1 18 TRUE TRUE TRUE TRUE TRUE TRUE
#> 2 32 TRUE TRUE TRUE TRUE TRUE TRUE
#> 3 2 TRUE FALSE FALSE FALSE FALSE FALSE
#> 4 34 TRUE TRUE TRUE TRUE TRUE TRUE
#> 5 38 TRUE TRUE TRUE TRUE TRUE TRUE
#> 6 19 TRUE TRUE TRUE TRUE TRUE TRUE
#> 7 20 TRUE TRUE TRUE TRUE TRUE TRUE
#> 8 12 TRUE TRUE TRUE TRUE TRUE TRUE
#> 9 23 TRUE TRUE TRUE TRUE TRUE TRUE
#> 10 7 TRUE TRUE TRUE TRUE TRUE TRUE
#> # … with 18 more variables: survival7 <lgl>, survival8 <lgl>, survival9 <lgl>,
#> # survival10 <lgl>, survival11 <lgl>, survival12 <lgl>, survival13 <lgl>,
#> # survival14 <lgl>, survival15 <lgl>, survival16 <lgl>, survival17 <lgl>,
#> # survival18 <lgl>, survival19 <lgl>, survival20 <lgl>, survival21 <lgl>,
#> # survival22 <lgl>, survival23 <lgl>, survival24 <lgl>
Created on 2022-07-08 by the reprex package (v2.0.1)
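One possible way to create and name the columns in a single step is to name the thresholds before mapping over them, so the names carry through to the result. This is only a sketch under the same sample data; `thresholds`, `new_cols`, and `result` are names I made up for illustration.

```r
library(dplyr)
library(purrr)

set.seed(1988)
test <- tibble(survival = sample(1:40, 10, replace = TRUE))

# Hypothetical helper vector: the thresholds to compare against
thresholds <- 1:24

# Naming the input makes map() return a named list,
# so each logical vector already carries its column name
new_cols <- map(
  set_names(thresholds, paste0("survival", thresholds)),
  ~ test$survival > .x
)

# Convert the named list to a tibble and attach it to the original data
result <- bind_cols(test, as_tibble(new_cols))
```

Because the names are attached up front, no `rename_with()` step or `suppressMessages()` call should be needed.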
CodePudding user response:
Using base R:
for (i in 1:24) {
  # `>` is vectorised, so the comparison needs no sapply()
  test[[paste0("survival", i)]] <- test$survival > i
}
Output:
# A tibble: 10 × 25
survival survival1 survival2 survival3 survival4 survival5
<int> <lgl> <lgl> <lgl> <lgl> <lgl>
1 18 TRUE TRUE TRUE TRUE TRUE
2 32 TRUE TRUE TRUE TRUE TRUE
3 2 TRUE FALSE FALSE FALSE FALSE
4 34 TRUE TRUE TRUE TRUE TRUE
5 38 TRUE TRUE TRUE TRUE TRUE
6 19 TRUE TRUE TRUE TRUE TRUE
7 20 TRUE TRUE TRUE TRUE TRUE
8 12 TRUE TRUE TRUE TRUE TRUE
9 23 TRUE TRUE TRUE TRUE TRUE
10 7 TRUE TRUE TRUE TRUE TRUE
# … with 19 more variables: survival6 <lgl>, survival7 <lgl>,
# survival8 <lgl>, survival9 <lgl>, survival10 <lgl>,
# survival11 <lgl>, survival12 <lgl>, survival13 <lgl>,
# survival14 <lgl>, survival15 <lgl>, survival16 <lgl>,
# survival17 <lgl>, survival18 <lgl>, survival19 <lgl>,
# survival20 <lgl>, survival21 <lgl>, survival22 <lgl>,
# survival23 <lgl>, survival24 <lgl>
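As a loop-free alternative in base R, `outer()` can compare every survival value against every threshold at once. The sketch below assumes a plain `data.frame` rather than a tibble so that it needs no packages; `thresholds`, `flags`, and `result` are illustrative names.

```r
set.seed(1988)
test <- data.frame(survival = sample(1:40, 10, replace = TRUE))

# Hypothetical helper vector: the thresholds to compare against
thresholds <- 1:24

# outer() builds a 10 x 24 logical matrix: row = observation, column = threshold
flags <- outer(test$survival, thresholds, ">")
colnames(flags) <- paste0("survival", thresholds)

# cbind() on a data frame splits the matrix into one logical column per threshold
result <- cbind(test, flags)
```

Each column of `flags` corresponds to one threshold, so the result should have the same shape as the loop version.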