How to regress a list of covariates with a desired predictor and dependent variable and return a tab

Time:12-09

I have a dataset with a rather large number of variables. In the dataset I have a predictor and an outcome variable I want to investigate. I want to find covariates that have either a significant effect on the outcome variable or a significant interaction with the predictor in their effect on the outcome variable.

It would therefore be convenient to regress the dependent variable on the desired predictor and each covariate in turn, and to create a table of the covariates' main effects and interaction effects with their respective p-values.

I want to do something like this:

library(dplyr)

# Generating sample data
set.seed(5)
df <- data.frame(matrix(round(abs(2*rnorm(100*100)), digits = 0), ncol=100))

# Selecting covariates
covar <- names(df)[! names(df) %in% c("X1", "X2")]

# Running the lm function over the list of covariates. I should get the covariate coefficients from each regression, but I get an error when I try to run this step.

coeff <- lapply(covar, function(x){ 
# Retrieve coefficient matrix
    summary(lm(X1 ~ X2 + x + X2*x, df))$coefficients %>% 
# Coerce into dataframe and filter for covariates and interaction effects
    as.data.frame(.) %>%
    filter(row.names(.) %in% grep(x, rownames(.), value = TRUE))}) %>%
# Finally I want to join all data frames into one
    bind_rows(.)

I could use some help with the syntax. I get the following error when I try to run the function:

Error in h(simpleError(msg, call)): error in evaluating the argument 'object' in selecting a method for function 'summary': variable lengths differ (found for 'x')

CodePudding user response:

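Inside the function passed to lapply, x is a character string holding the covariate's name, not the column itself, so lm cannot use it as a variable in the formula (hence "variable lengths differ"). It is better to build the model formula as text with paste0 and pass that to lm: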

library(dplyr)

lapply(covar, function(x){ 
  # Build the formula as text so that x is replaced by the covariate's name
  modd <- paste0("X1 ~ X2 + ", x, " + X2 * ", x)
  summary(lm(modd, df))$coefficients %>% 
    as.data.frame(.) %>%
    filter(row.names(.) %in% grep(x, rownames(.), value = TRUE))}) %>%
  bind_rows(.)
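
As a variation on the same idea, here is a minimal base R sketch that also records which covariate and which term each row belongs to, so the combined table does not rely on row names. It assumes the sample data from the question. The helper name fit_one and the column names covariate and term are made up for illustration. It uses reformulate() instead of pasting the whole formula, and it selects rows by exact term name rather than with grep.

```r
# Sample data as in the question: 100 columns named X1..X100
set.seed(5)
df <- data.frame(matrix(round(abs(2 * rnorm(100 * 100)), digits = 0), ncol = 100))

outcome   <- "X1"
predictor <- "X2"
covar     <- setdiff(names(df), c(outcome, predictor))

# fit_one is a hypothetical helper (not from any package)
fit_one <- function(x) {
  # "X2 * x" expands to X2 + x + X2:x
  form  <- reformulate(paste(predictor, x, sep = " * "), response = outcome)
  coefs <- summary(lm(form, data = df))$coefficients
  # Keep the covariate's main effect and its interaction with the predictor;
  # intersect() guards against terms dropped from the summary (e.g. aliased)
  keep <- intersect(c(x, paste(predictor, x, sep = ":")), rownames(coefs))
  data.frame(covariate = rep(x, length(keep)),
             term      = keep,
             coefs[keep, , drop = FALSE],
             row.names = NULL, check.names = FALSE)
}

# One row per retained term, stacked into a single table
result <- do.call(rbind, lapply(covar, fit_one))
head(result)
```

The p-values are in the column named `Pr(>|t|)`, so rows below a chosen threshold can be picked out with, for example, result[result$`Pr(>|t|)` < 0.05, ].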