After some digging (1, 2, 3), there appear to be a few posts about formulas used inside functions causing scoping issues, if I am understanding them correctly. Some suggest using an environment, assign, or <<- to get around the issue, but I've been stumped on how to use them (and confused about why there's an issue in the first place).
Let's try this toy code:
library(survival)
library(survminer)
set.seed(1)
give_p_val <- function() {
  df <- data.frame('OS' = ovarian[, 'futime'], 'Survival_event' = ovarian[, 'fustat'])
  subgroup <- sample(nrow(df), nrow(df)/2)
  df$Class <- 'A'
  df$Class[subgroup] <- 'B'
  fit2 <- survfit(Surv(OS, Survival_event) ~ Class, data=df)
  return(surv_pvalue(fit2))
}
give_p_val()
Calling the function fails, but running the same lines directly at the top level works, which hints at a scoping issue.
This code works and returns a fitted object:
survfit(Surv(futime, fustat) ~ rx, data=ovarian)
So why does the function break if we copy the data frame inside the function's scope?
testit <- function() {
  ovarian2 <- ovarian
  fit2 <- survfit(Surv(futime, fustat) ~ rx, data=ovarian2)
  return(surv_pvalue(fit2))
}
testit()
Ultimately, how do I generate a dataframe within a function to be handled correctly by the formula being used? Thanks!
CodePudding user response:
This is a known issue with how surv_pvalue in the survminer package interacts with survfit objects. If you change survfit to the survminer package's version, surv_fit, your function will work just fine (the version below also passes data = df to surv_pvalue explicitly).
give_p_val <- function() {
  df <- data.frame('OS' = ovarian[, 'futime'], 'Survival_event' = ovarian[, 'fustat'])
  subgroup <- sample(nrow(df), nrow(df)/2)
  df$Class <- 'A'
  df$Class[subgroup] <- 'B'
  fit2 <- surv_fit(Surv(OS, Survival_event) ~ Class, data=df)
  surv_pvalue(fit2, data = df)
}
give_p_val()
#   variable      pval   method pval.txt
# 1    Class 0.2615363 Log-rank p = 0.26
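As for why there is a scoping issue in the first place, here is a rough base R sketch of the general mechanism as I understand it. The names toy_fit, toy_pvalue, wrapper, and local_df are made up for illustration, and this is not survminer's actual code. The idea is that a fitted object typically stores the call that created it, so the data argument is kept as a name rather than as the data itself. If a helper later looks that name up from somewhere outside the function that created it, a data frame that only existed inside that function can no longer be found.

```r
# Hypothetical stand-ins for illustration only; not survminer's real code.
toy_fit <- function(formula, data) {
  # Store the unevaluated call, so `data` is kept as a name, not a value
  structure(list(call = match.call()), class = "toy_fit")
}

toy_pvalue <- function(fit, data = NULL) {
  if (is.null(data)) {
    # Re-resolve the data by name, starting from the global environment
    data <- eval(fit$call$data, envir = globalenv())
  }
  nrow(data)
}

wrapper <- function(pass_data) {
  local_df <- data.frame(x = 1:3)  # exists only inside this call
  fit <- toy_fit(y ~ x, data = local_df)
  if (pass_data) toy_pvalue(fit, data = local_df) else toy_pvalue(fit)
}

# Lookup by name fails: local_df is not visible from the global environment
failed <- tryCatch(wrapper(pass_data = FALSE),
                   error = function(e) conditionMessage(e))

# Handing the data frame over explicitly avoids the lookup entirely
worked <- wrapper(pass_data = TRUE)
```

In this sketch, failed holds an error message about local_df not being found, while worked is the row count of the local data frame. That mirrors why running the lines at the top level works (the data frame is then a global object) and why passing data = df explicitly helps inside a function.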