Regression through different data frames

Time:04-06

I have several dataframes (with different names) like those below, with the same number of rows and columns but different names for the last column.

df1:

ID  matching_variable   STATUS  code_842
1   1.1 1   1
2   1.1 0   1
3   1.2 1   0
4   1.2 1   0

df2:

ID  matching_variable   STATUS  code_853
1   1.1 1   1
2   1.1 0   0
3   1.2 1   0
4   1.2 1   1

I have about a dozen df's like this and I would like to do a logistic regression of this style for each df:

fit1<-clogit(STATUS~code_842 + strata(matching_variable),data=df1)
fit2<-clogit(STATUS~code_853 + strata(matching_variable),data=df2)

etc….

I would like to make a function to "automate" this (without having to write all the regressions) and have all the outputs of the regressions in a new table.

I thought of using something like this function: (but as I have different names for the df and for the last column, I get stuck...)

list<-list(df1,df2)

results<- lapply(list, function(x) {clogit(STATUS ~ code_??? + strata(matching_variable), data=???, l)})

Thank you in advance.

CodePudding user response:

Another possible solution, based on purrr::map:

library(purrr)
library(survival)

map(list(df1, df2), ~ clogit(STATUS ~ .x[,4] + strata(matching_variable), data=.x))

#> Warning in coxexact.fit(X, Y, istrat, offset, init, control, weights =
#> weights, : Ran out of iterations and did not converge
#> [[1]]
#> Call:
#> clogit(STATUS ~ .x[, 4] + strata(matching_variable), data = .x)
#> 
#>         coef exp(coef) se(coef)  z  p
#> .x[, 4]   NA        NA        0 NA NA
#> 
#> Likelihood ratio test=0  on 0 df, p=1
#> n= 4, number of events= 3 
#> 
#> [[2]]
#> Call:
#> clogit(STATUS ~ .x[, 4] + strata(matching_variable), data = .x)
#> 
#>              coef exp(coef)  se(coef)     z     p
#> .x[, 4] 2.020e+01 5.943e+08 2.438e+04 0.001 0.999
#> 
#> Likelihood ratio test=1.39  on 1 df, p=0.239
#> n= 4, number of events= 3

CodePudding user response:

Make a custom function that finds the name of the last column and uses it in the clogit formula, something like the code below (not tested):

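myClogit <- function(d){
  # name of the last column, e.g. "code_842"
  lastColName <- tail(colnames(d), 1)
  # build STATUS ~ <last column> + strata(matching_variable)
  f <- as.formula(
    paste("STATUS ~", lastColName, "+ strata(matching_variable)"))
  clogit(f, data = d)
}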

Then make a list of dataframes and loop:

lapply(list(df1, df2), myClogit)
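
The question also asks for all the outputs in one table, which neither answer covers. A minimal sketch of one way to do that, assuming the toy data from the question and a named list of data frames; `fit_last_col`, `dfs`, and the column names of `results` are hypothetical names chosen for illustration, and only the coefficient and its exponent are collected here:

```r
library(survival)

# Toy data copied from the question
df1 <- data.frame(ID = 1:4,
                  matching_variable = c(1.1, 1.1, 1.2, 1.2),
                  STATUS = c(1, 0, 1, 1),
                  code_842 = c(1, 1, 0, 0))
df2 <- data.frame(ID = 1:4,
                  matching_variable = c(1.1, 1.1, 1.2, 1.2),
                  STATUS = c(1, 0, 1, 1),
                  code_853 = c(1, 0, 0, 1))

# Naming the list lets each result row be traced back to its data frame
dfs <- list(df1 = df1, df2 = df2)

# Hypothetical helper: fit clogit using whatever the last column is called
fit_last_col <- function(d) {
  last_col <- tail(colnames(d), 1)
  f <- reformulate(c(last_col, "strata(matching_variable)"),
                   response = "STATUS")
  clogit(f, data = d)
}

fits <- lapply(dfs, fit_last_col)

# One row per model: source data frame, term name, coefficient, odds ratio
results <- do.call(rbind, lapply(names(fits), function(nm) {
  b <- coef(fits[[nm]])
  data.frame(data = nm,
             term = names(b),
             coef = unname(b),
             odds_ratio = exp(unname(b)),
             stringsAsFactors = FALSE)
}))

results
```

With only four rows per data frame the fits may warn about non-convergence or return NA, as in the output shown in the first answer, so the values here are not meaningful; the point is the shape of the table. Because the formula is built from the column name, the `term` column shows `code_842` or `code_853` rather than `.x[, 4]`.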