I have a question related to R and nested lists. Let's assume I have a nested list with this structure:
library(tidyr)
library(purrr)

simul <- list(
  "Q" = tibble(
    months = seq(1, 12, by = 1),
    run1 = runif(12),
    run2 = runif(12),
    run3 = runif(12)),
  "ET1" = tibble(
    months = seq(1, 12, by = 1),
    run1 = runif(12),
    run2 = runif(12),
    run3 = runif(12)),
  "ET2" = tibble(
    months = seq(1, 12, by = 1),
    run1 = runif(12),
    run2 = runif(12),
    run3 = runif(12)),
  "ET3" = tibble(
    months = seq(1, 12, by = 1),
    run1 = runif(12),
    run2 = runif(12),
    run3 = runif(12))
)
I am trying to obtain some averages by variables ("ET") and by run, thus maintaining the lower-level list structure. The variables are grouped by their starting characters, and vary by the number at the end.
So far I have solved the problem as shown below. However, I was wondering whether you could suggest a better way, one that I can easily apply to a larger list with more "variables" and "runs".
nested_avg <- list(
  "Q" = simul["Q"],
  "ET" = tibble(
    months = simul$ET1$months,
    run1 = rowMeans(do.call(cbind, simul[c("ET1", "ET2", "ET3")] %>% map(2))),
    run2 = rowMeans(do.call(cbind, simul[c("ET1", "ET2", "ET3")] %>% map(3))),
    run3 = rowMeans(do.call(cbind, simul[c("ET1", "ET2", "ET3")] %>% map(4)))
  )
)
Thank you so much for your answer.
CodePudding user response:
Some enframe() then deframe() with grouping does the trick:
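library(dplyr)    # group_by(), summarise(), n()
library(tibble)   # enframe(), deframe()
library(stringr)  # str_remove_all()

simul %>%
  setNames(str_remove_all(names(.), "\\d")) %>%
  enframe() %>%
  group_by(name) %>%
  summarise(value = list(as_tibble(Reduce("+", value) / n()))) %>%
  deframe()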
Note that as_tibble() is necessary because the arithmetic applied through Reduce() and the division does not return a tibble.
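The same group-then-average idea can also be sketched in base R. This is only an illustration under stated assumptions: simul_demo is a hypothetical toy list (not the asker's data), plain data frames stand in for tibbles, and every table in a group is assumed to have the same shape and column order.

```r
# Hypothetical toy input: one "Q" table and two "ET" tables, two months each.
simul_demo <- list(
  Q   = data.frame(months = c(1, 2), run1 = c(1, 2), run2 = c(3, 4)),
  ET1 = data.frame(months = c(1, 2), run1 = c(1, 3), run2 = c(5, 7)),
  ET2 = data.frame(months = c(1, 2), run1 = c(3, 5), run2 = c(7, 9))
)

# Strip trailing digits so that ET1 and ET2 share the group name "ET".
prefix <- sub("[0-9]+$", "", names(simul_demo))

# Split the list by prefix, then average each group element-wise:
# Reduce() sums the data frames, and the division turns the sum into a mean.
avg_by_prefix <- lapply(split(simul_demo, prefix), function(grp) {
  Reduce(`+`, grp) / length(grp)
})

avg_by_prefix$ET
```

Because the grouping comes from the names themselves, adding more variables (for example a hypothetical "P1", "P2") or more run columns should not require any change to the code. As in the answer above, the months column is averaged too, which leaves it unchanged when all tables share the same months.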
CodePudding user response:
Looping and Mapping (base::Map in this instance, not purrr::map):
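# Logical index of the list elements whose names start with "ET"
vars <- startsWith(names(simul), "ET")
out <- list(
  "Q" = simul["Q"],
  # Map() walks the columns of all ET tibbles in parallel;
  # the \(...) lambda shorthand needs R >= 4.1
  "ET" = as_tibble(do.call(Map, c(\(...) rowMeans(cbind(...)), simul[vars])))
)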
identical(out, nested_avg)
#[1] TRUE
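To see what the do.call(Map, ...) step does, here is a minimal sketch on a hypothetical two-table list (et_demo is made up for illustration and uses plain data frames). Map() receives the tables as separate arguments and therefore iterates over their columns in parallel: first all months columns, then all run1 columns.

```r
# Hypothetical toy input: two tables with identical column layout.
et_demo <- list(
  ET1 = data.frame(months = c(1, 2), run1 = c(1, 3)),
  ET2 = data.frame(months = c(1, 2), run1 = c(3, 5))
)

# For each column position, bind the matching columns side by side
# and take the row-wise mean across tables.
col_means <- do.call(Map, c(f = function(...) rowMeans(cbind(...)), et_demo))

# The result is a named list with one averaged vector per column.
avg_map <- as.data.frame(col_means)
avg_map
```

The function(...) form is used here instead of \(...) so that the sketch also runs on R versions older than 4.1.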
And an alternative method flattening everything to an array:
out <- list(
"Q" = simul["Q"],
"ET" = as_tibble(apply(sapply(simul[vars], as.matrix, simplify="array"), 1:2, mean))
)
identical(out, nested_avg)
#[1] TRUE
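The array approach can be checked on a small hypothetical example (arr_demo is an assumption for illustration, with plain data frames in place of tibbles). Stacking the tables gives a three-dimensional array of rows x columns x tables, and apply() over the first two margins averages across the third.

```r
# Hypothetical toy input: two tables with two rows and two columns each.
arr_demo <- list(
  ET1 = data.frame(months = c(1, 2), run1 = c(1, 3)),
  ET2 = data.frame(months = c(1, 2), run1 = c(3, 5))
)

# Convert each table to a matrix and stack them along a third dimension.
arr <- sapply(arr_demo, as.matrix, simplify = "array")
dim(arr)

# Average over the third dimension (the tables) for every row/column cell.
avg_arr <- apply(arr, 1:2, mean)
avg_arr
```

This relies on all tables having the same dimensions and only numeric columns; with mixed column types, as.matrix() would produce a character matrix and the mean would fail.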