Looping lm models of column in a list of dataframes and outputting dataframes showing the slope and

Time:04-01

I want to fit lm() models for each variable i (the response) against one explanatory variable, across a list of dataframes split by a factor. I then want to build two dataframes of lm results: the first holding the slopes and the second the p-values, with the response variables tested in the models as columns and the factor levels as rows.

I managed to run the models and print their summaries, but I am not sure how to build the slope and p-value dataframes.

Here is what I've done:

data (iris)
iris_split = split (iris,f=iris$Species) ### Split the data by factor "Species"

I want to run lm models for each of the following variables (treated as responses for the sake of the question) with Petal.Width as the explanatory variable:

vars = as.vector (unique (colnames (subset (iris, select = -c(Species, Petal.Width )))))
#Output:
#> vars
#[1] "Sepal.Length" "Sepal.Width"  "Petal.Length"
for (i in vars) { # loop across vars; a for loop returns NULL, so its result is not assigned
  lm_summary = lapply (iris_split, FUN = function(x)
    summary(lm (x[,i] ~ x[,"Petal.Width"]))) # where x is the dataframe for one level of "Species"
  print(i) # so I could see which variable is tested in the model
  print(lm_summary)
}

How do I create the slop.df and p.val.df? They need to look like this:

#> slop.df
#     Species Sepal.Length Sepal.Width Petal.Length
#1     setosa       slope?      slope?       slope?
#2 versicolor       slope?      slope?       slope?
#3  virginica       slope?      slope?       slope?

The actual slopes should appear in place of the "slope?" placeholders, and p.val.df should have the same layout with p-values.

CodePudding user response:

Packages from the tidyverse make this fairly convenient:

library(dplyr)
library(tidyr)

iris %>%
    pivot_longer(-c(Species, Petal.Width),
                 names_to = 'variable',
                 values_to = 'value'
                 ) %>%
    group_by(Species, variable) %>%
    ## mind to return the model results as a list!
    ## value is the response and Petal.Width the explanatory variable, as in the question
    summarise(model_summary = list(summary(lm(value ~ Petal.Width))),
              .groups = 'drop') %>%
    rowwise() %>%
    mutate(slope = model_summary$coefficients[2, 'Estimate'],
           p = model_summary$coefficients[2, 'Pr(>|t|)']
           ) %>%
    ungroup() %>%
    ## use values_from = 'p' instead to get the p-value table
    pivot_wider(id_cols = Species,
                names_from = 'variable',
                values_from = 'slope')
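
For comparison, here is a minimal base-R sketch that stays closer to the split-and-loop approach in the question and builds both requested dataframes. It assumes each response is regressed on Petal.Width within each species, as the question describes. The helper extract_coef() is a hypothetical name introduced here for illustration, not a function from any package.

```r
# Split by species and pick the response variables, as in the question
iris_split <- split(iris, iris$Species)
vars <- setdiff(colnames(iris), c("Species", "Petal.Width"))

# extract_coef() is a hypothetical helper: for every response in vars and
# every species, fit response ~ Petal.Width and pull one column of the
# coefficient table for the Petal.Width row
extract_coef <- function(column) {
  values <- sapply(vars, function(v) {
    sapply(iris_split, function(d) {
      fit <- lm(reformulate("Petal.Width", response = v), data = d)
      coef(summary(fit))["Petal.Width", column]
    })
  })
  # values is a matrix: species in rows, response variables in columns
  data.frame(Species = rownames(values), values, row.names = NULL)
}

slop.df  <- extract_coef("Estimate")   # slopes
p.val.df <- extract_coef("Pr(>|t|)")   # p-values of the slopes

print(slop.df)
print(p.val.df)
```

Both results have one row per species and one column per response variable, matching the layout asked for in the question.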