I'm trying to fit a linear regression to every data frame in a list, with a separate model for each level of a grouping variable.
For example, given a list of iris data frames, I want to fit Sepal.Length ~ Sepal.Width with a separate regression for each species. For a single data frame, this works:
library(plyr)
test <- dlply(iris, "Species", function(x) lm(Sepal.Length ~ Sepal.Width, data = x))
But I want to do this for a whole list of data frames at once, rather than one at a time. Say I have a list of several iris data frames:
iris1 <- iris
iris2 <- iris
iris3 <- iris
iris_list <- list(iris1, iris2, iris3)
How do I apply the dlply code to every element of iris_list? I'm lost.
CodePudding user response:
Use lapply to run the dlply call on each data frame in the list:
library(plyr)
# For each data frame in the list, fit one model per species
res <- lapply(iris_list, function(df) {
  dlply(df, "Species", function(x) lm(Sepal.Length ~ Sepal.Width, data = x))
})
CodePudding user response:
purrr
Try purrr::map if you want to fit a regression to each data frame in a list:
library(purrr)
map(iris_list, ~ lm(Sepal.Length ~ Sepal.Width, data = .x))
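If you want to split a data frame into a list based on a factor, you can use split from base R first (the formula form, ~ Species, needs R 4.1.0 or later; on older versions use split(iris, iris$Species)):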
imap(split(iris, ~ Species), ~ lm(Sepal.Length ~ Sepal.Width, data = .x))
Note: imap also passes each element's name, here a level of Species, to the function as .y. The result keeps the list names either way, so map gives the same named list in this case.
base R
If you only want to use base R, lapply will work:
lapply(split(iris, ~ Species), function(x) lm(Sepal.Length ~ Sepal.Width, data = x))
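Once the models are in a named list, they can be inspected together. The following is a minimal sketch, assuming you only need the coefficients; the names models and coefs are made up for illustration, and it splits on iris$Species directly so it also runs on R versions before 4.1.0.

```r
# Fit one model per species; split() names the list by the levels of Species
models <- lapply(split(iris, iris$Species), function(x) {
  lm(Sepal.Length ~ Sepal.Width, data = x)
})

# Collect the coefficients: one column per species, one row per coefficient
coefs <- sapply(models, coef)
print(coefs)

# Pull a single model out by name and look at its R-squared
summary(models[["setosa"]])$r.squared
```

Because every model has the same two coefficients, sapply simplifies the result to a matrix; with models of differing shape it would return a list instead.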
If you have a list of data frames and each one needs to be split by Species, as in the question, then you need to nest your mapping:
library(purrr)
map(iris_list, ~ imap(split(.x, ~ Species), ~ lm(Sepal.Length ~ Sepal.Width, data = .x)))
And again in base R:
lapply(iris_list, function(x) lapply(split(x, ~ Species), function(y) lm(Sepal.Length ~ Sepal.Width, data = y)))
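The nested result is a list of lists: the outer level is the data frame, the inner level is the species. Here is a rough sketch of how it might be indexed and summarised, assuming the list elements are given names first (iris1, iris2, iris3 as list names, plus fits and slopes, are my own choices and not part of the question's code).

```r
# Name the list elements so results can be looked up by name (assumed naming)
iris_list <- list(iris1 = iris, iris2 = iris, iris3 = iris)

# Outer lapply: each data frame; inner lapply: each species within it
fits <- lapply(iris_list, function(df) {
  lapply(split(df, df$Species), function(d) {
    lm(Sepal.Length ~ Sepal.Width, data = d)
  })
})

# Access one model: data frame first, then species
coef(fits[["iris2"]][["versicolor"]])

# Slope of Sepal.Width for every species (rows) and data frame (columns)
slopes <- sapply(fits, function(by_species) {
  sapply(by_species, function(m) coef(m)[["Sepal.Width"]])
})
print(slopes)
```

With the three identical copies of iris from the question, every column of slopes is the same; with real data each column would reflect its own data frame.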