Lapply for dlply

Time:10-16

I'm trying to apply a linear regression to a list of data frames, split by a specific variable. For example, given a list of iris data frames, I want to fit Sepal.Length ~ Sepal.Width with a separate regression for each species. For a single data frame I can do this with dlply:

library(plyr)

test <- dlply(iris, "Species", function(x) lm(Sepal.Length ~ Sepal.Width, data = x))

But I want to do it for a list of data frames at once, rather than one at a time. So, say I had a list of several iris data frames:

iris1 <- iris
iris2 <- iris
iris3 <- iris
iris_list <- list(iris1, iris2, iris3)

I'm lost on how to apply the dlply code to every element of iris_list.

CodePudding user response:

Using lapply:

res <- lapply(iris_list, function(df) {
  # fit one model per species within this data frame
  dlply(df, "Species", function(d) lm(Sepal.Length ~ Sepal.Width, data = d))
})

CodePudding user response:

purrr

Try using purrr::map if you want to apply a regression across a list of data frames (this fits one model per data frame, not one per species):

library(purrr)

map(iris_list, ~ lm(Sepal.Length ~ Sepal.Width, data = .x))

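If you want to split a data frame into a list based on a factor, you can use split from base R first. The formula form split(iris, ~ Species) needs R 4.1.0 or later; on older versions split(iris, iris$Species) does the same thing: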

imap(split(iris, ~ Species), ~ lm(Sepal.Length ~ Sepal.Width, data = .x))

Note: the result keeps the list names, which are the levels of Species in this case. map preserves them as well; imap additionally passes each name to the function as .y, which is not used here.
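
To make the difference between map and imap concrete, here is a small sketch. It assumes purrr is installed; the object names by_species, fits and labels are made up for illustration.

```r
library(purrr)

# Split into one data frame per species; the list is named by factor level
by_species <- split(iris, iris$Species)

# map() already returns a list carrying the same names as its input
fits <- map(by_species, ~ lm(Sepal.Length ~ Sepal.Width, data = .x))
names(fits)

# imap() also hands each element's name to the function as .y,
# used here to build a text label per species
labels <- imap_chr(fits, ~ sprintf("%s: slope = %.3f", .y, coef(.x)[["Sepal.Width"]]))
labels
```

So imap is only needed when the function itself has to see the name; for simply keeping names on the result, map is enough.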

base R

If you only want to use base R and lapply then this will work:

lapply(split(iris, ~ Species), function(x) lm(Sepal.Length ~ Sepal.Width, data = x))

If you have a list of data frames and want a separate model per species within each one, you need to nest your mapping:

library(purrr)

map(iris_list, ~ imap(split(.x, ~ Species), ~ lm(Sepal.Length ~ Sepal.Width, data = .x)))

And again in base R:

lapply(iris_list, function(x) lapply(split(x, ~ Species), function(y) lm(Sepal.Length ~ Sepal.Width, data = y)))
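
Once the nested list of models exists, you will probably want to pull results out of it. The following is a rough base R sketch, not part of the original answers: the element names iris1, iris2, iris3 and the objects fits and coef_table are assumptions added so that models can be looked up by name.

```r
# Hypothetical names for the three copies so results can be looked up by name
iris_list <- list(iris1 = iris, iris2 = iris, iris3 = iris)

# Outer lapply: over data frames; inner lapply: over species within each one
fits <- lapply(iris_list, function(x) {
  lapply(split(x, x$Species), function(y) lm(Sepal.Length ~ Sepal.Width, data = y))
})

# Look up a single model by data frame name and species
coef(fits$iris2$versicolor)

# Flatten to one row per (data frame, species) with intercept and slope
coef_table <- do.call(rbind, lapply(names(fits), function(nm) {
  cf <- t(sapply(fits[[nm]], coef))  # species in rows, coefficients in columns
  data.frame(dataset = nm, species = rownames(cf),
             intercept = cf[, "(Intercept)"], slope = cf[, "Sepal.Width"],
             row.names = NULL)
}))
coef_table
```

Naming the list elements up front is what makes fits$iris2$versicolor possible; with an unnamed list you would index by position instead, for example fits[[2]]$versicolor.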