I could use some help with a custom function that I've been working on that takes the parallel minima of a vector of variables (all date of diagnosis variables, with dx dates for cases and NA for noncases).
I want to create 2 variables:
- {prefix}flag1 sets all nonmissing values to 1 and all missing values to 0
- {prefix}flag2 finds the earliest possible date of the vector of variables for each observation/ row.
I've been able to generate a function using the invoke() wrapper; however, since the rlang function is being depreciated, I'm trying to execute the same function using exec() or inject() instead.
The actual dataset has 100 columns, but I am supplying a minimally reproducible example below.
# load libs
library(tidyverse)
library(lubridate)
rdate <- function(x,
min = paste0(format(Sys.Date(), '%Y'), '-01-01'),
max = paste0(format(Sys.Date(), '%Y'), '-12-31'),
sort = TRUE) {
dates <- sample(seq(as.Date(min), as.Date(max), by = "day"), x, replace = TRUE)
if (sort == TRUE) {
sort(dates)
} else {
dates
}
}
# set seed for reproducibility
set.seed(42)
# generating ds of random dates
date_ds <- data.frame(v.ar1 = rdate(15),
var_2 = rdate(15),
random3 = rdate(15),
v4= rdate(15),
v__5 = rdate(15),
variable6 = rdate(15),
dates7 = rdate(15)) %>%
# remove some observations
mutate(var_2 = as_date(ifelse(var_2 < dates7, var_2, NA)),
variable6 = as_date(ifelse(variable6 <= v__5, variable6, NA)),
v4 = as_date(ifelse(v4 > v.ar1, v4, NA)),
)
# vector of names
varnames <- names(date_ds)
# function
collapse_phenos <- function(varvector, prefix){
outds <- date_ds %>%
mutate(!!paste0(prefix,"flag1") := (rowSums(!is.na(select(., all_of(varvector)))) > 0),
!!paste0(prefix,"flag2") := invoke(pmin, c(across(all_of(varvector)), na.rm = TRUE)))
return(outds)
}
# return ds
result <- collapse_phenos(varvector = varnames, prefix = "")
Any help would be appreciated! A tidyverse solution would be preferable.
CodePudding user response:
We can modify the function with invoke
to exec
using the splice operator (!!!
) after converting the strings to symbols (syms
)
collapse_phenos <- function(varvector, prefix){
outds <- date_ds %>%
mutate(!!paste0(prefix,"flag1") := (rowSums(!is.na(select(., all_of(varvector)))) > 0),
!!paste0(prefix,"flag2") := exec(pmin, !!! rlang::syms(varvector), na.rm = TRUE))
return(outds)
}
-testing
# return ds
result <- collapse_phenos(varvector = varnames, prefix = "")
-output
> result
v.ar1 var_2 random3 v4 v__5 variable6 dates7 flag1 flag2
1 2022-01-24 <NA> 2022-01-04 <NA> 2022-01-16 <NA> 2022-01-01 1 2022-01-01
2 2022-02-18 2022-01-20 2022-04-19 <NA> 2022-02-04 <NA> 2022-02-11 1 2022-01-20
3 2022-02-18 2022-01-24 2022-04-24 <NA> 2022-03-17 2022-03-01 2022-02-11 1 2022-01-24
4 2022-03-15 <NA> 2022-05-16 <NA> 2022-03-23 <NA> 2022-03-13 1 2022-03-13
5 2022-03-30 <NA> 2022-05-26 2022-04-13 2022-04-20 2022-04-10 2022-04-18 1 2022-03-30
6 2022-05-02 2022-04-20 2022-07-16 <NA> 2022-04-28 <NA> 2022-04-22 1 2022-04-20
7 2022-05-08 <NA> 2022-08-03 2022-05-10 2022-05-10 <NA> 2022-05-06 1 2022-05-06
8 2022-05-26 <NA> 2022-08-14 <NA> 2022-08-08 2022-07-27 2022-05-21 1 2022-05-21
9 2022-06-02 <NA> 2022-09-02 2022-06-06 2022-08-09 <NA> 2022-07-25 1 2022-06-02
10 2022-06-14 <NA> 2022-10-19 2022-07-05 2022-09-05 <NA> 2022-09-05 1 2022-06-14
11 2022-08-16 <NA> 2022-10-26 <NA> 2022-09-22 <NA> 2022-10-12 1 2022-08-16
12 2022-10-30 2022-10-25 2022-11-10 <NA> 2022-11-21 2022-10-15 2022-10-26 1 2022-10-15
13 2022-11-17 <NA> 2022-11-20 <NA> 2022-11-21 2022-10-23 2022-11-07 1 2022-10-23
14 2022-11-23 <NA> 2022-12-14 2022-11-25 2022-12-23 2022-10-25 2022-12-03 1 2022-10-25
15 2022-12-22 <NA> 2022-12-21 2022-12-24 2022-12-26 2022-12-05 2022-12-07 1 2022-12-05