> mylist
$result.1
  truth model1 model2
1     1      2    1.0
2     2      3   -0.5
3     3     -1    4.0

$result.2
  truth model1 model2
1     1      1      2
2     2      4      2
3     3      4      1
I have a list that contains a number of sublists. In the example above there are 2, but there can be more. Each sublist holds a data.frame that contains the truth and the predictions from model1 and model2. I want to reshape my list so that each sublist corresponds to a specific model, i.e., I would like:
$model1
  truth result.1 result.2
1     1        2        1
2     2        3        4
3     3       -1        4

$model2
  truth result.1 result.2
1     1      1.0        2
2     2     -0.5        2
3     3      4.0        1
Is there a quick way to reshape the list this way?
CodePudding user response:
Using cbind inside do.call:
models <- names(L[[1]])[-1]  # model columns, i.e. everything except 'truth'
lapply(models, \(m) do.call(cbind, c(L[[1]][, 1, drop = FALSE], lapply(L, `[[`, m)))) |>
  setNames(models)
# $model1
#      truth result.1 result.2
# [1,]     1        2        1
# [2,]     2        3        4
# [3,]     3       -1        4
#
# $model2
#      truth result.1 result.2
# [1,]     1      1.0        2
# [2,]     2     -0.5        2
# [3,]     3      4.0        1
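Note: requires R >= 4.1 for the \(x) lambda shorthand and the native pipe |>. The result is a list of matrices, not data frames.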
Data:
L <- list(result.1 = structure(list(truth = 1:3, model1 = c(2, 3,
-1), model2 = c(1, -0.5, 4)), class = "data.frame", row.names = c(NA,
-3L)), result.2 = structure(list(truth = 1:3, model1 = c(1, 4,
4), model2 = c(2, 2, 1)), class = "data.frame", row.names = c(NA,
-3L)))
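If data frames are preferred over matrices, the same idea can be sketched with data.frame() instead of cbind(). This is only a variation on the approach above, not part of the original answer; the names models and by_model are made up for illustration, and it assumes the truth column is identical (same values, same order) in every sublist.

```r
# Same toy data as above.
L <- list(
  result.1 = data.frame(truth = 1:3, model1 = c(2, 3, -1), model2 = c(1, -0.5, 4)),
  result.2 = data.frame(truth = 1:3, model1 = c(1, 4, 4), model2 = c(2, 2, 1))
)

# Model columns are everything except 'truth' (taken from the first sublist).
models <- setdiff(names(L[[1]]), "truth")

# For each model, pull that column out of every sublist and put the pieces
# next to the shared 'truth' column. Naming the input vector makes lapply()
# return a list named by model.
by_model <- lapply(
  setNames(models, models),
  function(m) data.frame(truth = L[[1]]$truth, lapply(L, `[[`, m))
)

by_model
```

Because lapply(L, ...) keeps the names of L, the new columns are automatically called result.1, result.2, and so on, however many sublists there are.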
CodePudding user response:
Consider iterating across distinct model column names with a chain merge:
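newlist <- sapply(
  names(mylist$result.1)[-1],              # model column names (all but 'truth')
  function(nm) {
    # keep only 'truth' and the current model column, then merge all sublists
    df <- Reduce(
      function(x, y) merge(x, y, by = "truth"),
      lapply(mylist, `[`, c("truth", nm))
    )
    # rename the merged columns to result.1, result.2, ...
    setNames(df, c("truth", paste0("result.", seq_len(ncol(df) - 1))))
  },
  simplify = FALSE
)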
newlist
$model1
  truth result.1 result.2
1     1        2        1
2     2        3        4
3     3       -1        4

$model2
  truth result.1 result.2
1     1      1.0        2
2     2     -0.5        2
3     3      4.0        1
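Since merge() joins on the values of truth rather than on row position, the chain merge should still line things up when the sublists store their rows in a different order. A minimal sketch of that, using a made-up input where result.2 is shuffled; the variable names shuffled and merged are hypothetical, and the columns are named from names(shuffled) here instead of a generated result.N sequence:

```r
# Hypothetical input: same truth values, but result.2 has its rows shuffled.
shuffled <- list(
  result.1 = data.frame(truth = 1:3, model1 = c(2, 3, -1)),
  result.2 = data.frame(truth = c(3L, 1L, 2L), model1 = c(4, 1, 4))
)

# Chain merge on 'truth' for a single model column.
merged <- Reduce(
  function(x, y) merge(x, y, by = "truth"),
  lapply(shuffled, `[`, c("truth", "model1"))
)

# Name the value columns after the sublists they came from.
names(merged) <- c("truth", names(shuffled))

merged
```

The trade-off is that merge() with its default all = FALSE drops any truth value that is missing from one of the sublists, so this only reproduces the positional result when every sublist covers the same truth values.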
CodePudding user response:
Here's a tidyverse option: if you bind all the data frames into one and use the list's names to mark off which model the data comes from, it becomes a simple transpose operation. Then you can split again by model.
I added an additional model to test how it scales: you don't hard-code the number of trials or models or their names, and if a model is missing for one trial, you'll have NAs but no errors.
library(dplyr)
mylist %>%
  bind_rows(.id = "trial") %>%
  tidyr::pivot_longer(matches("model\\d+"), names_to = "model") %>%
  tidyr::pivot_wider(names_from = trial) %>%
  split(.$model) %>%
  purrr::map(select, -model)
#> $model1
#> # A tibble: 3 × 3
#>   truth result.1 result.2
#>   <int>    <dbl>    <dbl>
#> 1     1        2        1
#> 2     2        3        4
#> 3     3       -1        4
#>
#> $model2
#> # A tibble: 3 × 3
#>   truth result.1 result.2
#>   <int>    <dbl>    <dbl>
#> 1     1      1          2
#> 2     2     -0.5        2
#> 3     3      4          1
#>
#> $model3
#> # A tibble: 3 × 3
#>   truth result.1 result.2
#>   <int>    <dbl>    <dbl>
#> 1     1        0        9
#> 2     2        4        5
#> 3     3        2        2
Data from jay.sf's answer plus another dummy column (model3):
mylist <- list(result.1 = structure(list(truth = 1:3, model1 = c(2, 3, -1), model2 = c(1, -0.5, 4), model3 = c(0, 4, 2)), class = "data.frame", row.names = c(NA, -3L)), result.2 = structure(list(truth = 1:3, model1 = c(1, 4, 4), model2 = c(2, 2, 1), model3 = c(9, 5, 2)), class = "data.frame", row.names = c(NA, -3L)))
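To illustrate the point about a model missing from one trial, here is a small sketch with a made-up input in which result.2 has no model2 column. The names trials and by_model are hypothetical; the pipeline is the same as above and assumes dplyr, tidyr and purrr are installed.

```r
library(dplyr)

# Hypothetical input: model2 was not run in the second trial.
trials <- list(
  result.1 = data.frame(truth = 1:3, model1 = c(2, 3, -1), model2 = c(1, -0.5, 4)),
  result.2 = data.frame(truth = 1:3, model1 = c(1, 4, 4))
)

by_model <- trials %>%
  bind_rows(.id = "trial") %>%                 # missing model2 is filled with NA
  tidyr::pivot_longer(matches("model\\d+"), names_to = "model") %>%
  tidyr::pivot_wider(names_from = trial) %>%   # one column per trial
  split(.$model) %>%
  purrr::map(select, -model)

by_model$model2
```

Here bind_rows() pads the absent column with NA, so the result.2 column of by_model$model2 comes out as all NA rather than raising an error.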