I have a list of several data frames. I want to iterate over the list, pull the nth column from each data frame, and place those columns side by side in a single data frame.
Suppose I need to pull the second column from each element of this list:
library(tidyverse)
mylist <- list(mt1 = mtcars, mt2 = mtcars*2, mt3 = mtcars*3)
I want a result similar to this, with cbind:
> mylist[[1]][2] %>%
cbind(mylist[[2]][2]) %>%
cbind(mylist[[3]][2]) %>%
head()
cyl cyl cyl
Mazda RX4 6 12 18
Mazda RX4 Wag 6 12 18
Datsun 710 4 8 12
Hornet 4 Drive 6 12 18
Hornet Sportabout 8 16 24
Valiant 6 12 18
But I need code that works for any number of list elements, so that I don't have to rewrite it whenever the length of the list changes. How can I achieve this?
I can use a for loop, but the output is different from what I need:
for (i in seq_along(mylist)){
print(mylist[[i]] %>% select(2))
}
The same with sapply or lapply:
sapply(mylist, function(x) x%>% select(2))
lapply(mylist, function(x) x%>% select(2))
With map_df I get a single data frame, but the rows are stacked on top of each other instead of the columns sitting side by side:
> map_df(mylist, function(x) x%>% select(2)) %>%
head()
cyl
Mazda RX4...1 6
Mazda RX4 Wag...2 6
Datsun 710...3 4
Hornet 4 Drive...4 6
Hornet Sportabout...5 8
Valiant...6 6
How can I pull the columns from each dataframe on the list, and arrange each column side by side?
CodePudding user response:
You can use map_dfc rather than map_df, as it will bind the columns.
library(tidyverse)
map_dfc(mylist, select, 2) %>%
head()
# cyl...1 cyl...2 cyl...3
#Mazda RX4 6 12 18
#Mazda RX4 Wag 6 12 18
#Datsun 710 4 8 12
#Hornet 4 Drive 6 12 18
#Hornet Sportabout 8 16 24
#Valiant 6 12 18
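To see why the switch matters, here is a small sketch that applies both kinds of binding to the same extracted columns. The names cols, stacked, and side_by_side are mine, not from the question. map_df row-binds its results while map_dfc column-binds them, which is roughly what bind_rows and bind_cols do here:

```r
library(dplyr)
library(purrr)

mylist <- list(mt1 = mtcars, mt2 = mtcars * 2, mt3 = mtcars * 3)

# Extract the second column of each data frame as a one-column data frame
cols <- map(mylist, \(df) select(df, 2))

stacked      <- bind_rows(cols)  # rows appended: 3 * 32 rows, 1 column
side_by_side <- bind_cols(cols)  # columns appended: 32 rows, 3 columns

dim(stacked)       # 96  1
dim(side_by_side)  # 32  3
```

Column-binding only works when every data frame has the same number of rows, which holds here because all three are derived from mtcars.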
Also, if we want to assign a name (e.g., add a sequential number to each column), then we could use map2_dfc. You could also pass a different set of names.
map2_dfc(mylist,
seq_along(mylist),
\(x, y) x %>% select(2) %>% rename(!!paste0(names(.)[1], y) := 1)) %>%
head()
# cyl1 cyl2 cyl3
#Mazda RX4 6 12 18
#Mazda RX4 Wag 6 12 18
#Datsun 710 4 8 12
#Hornet 4 Drive 6 12 18
#Hornet Sportabout 8 16 24
#Valiant 6 12 18
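As a sketch of the "different set of names" idea, assuming the list itself is named: imap passes each element together with its name, so the list names can become column suffixes. The cyl_mt1 naming style and the variable names renamed and result are just my choices for illustration:

```r
library(dplyr)
library(purrr)

mylist <- list(mt1 = mtcars, mt2 = mtcars * 2, mt3 = mtcars * 3)

# For each data frame, keep column 2 and append the list name to its column name
renamed <- imap(mylist, \(df, nm) {
  col <- df[2]                                    # one-column data frame
  names(col) <- paste(names(col), nm, sep = "_")  # e.g. "cyl_mt1"
  col
})

# The names are already unique, so bind_cols has nothing to repair
result <- bind_cols(renamed)
head(result)
```

Because the suffix comes from names(mylist), this should keep working if elements are added to or removed from the list.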
CodePudding user response:
Here's how I'd do that:
library(dplyr)
variable_number_to_get <- 2
# all_of() makes it explicit that the column position comes from an external variable
newList <- lapply(mylist, function(x) x %>% select(all_of(variable_number_to_get)))
bind_cols(newList)
cyl...1 cyl...2 cyl...3
Mazda RX4 6 12 18
Mazda RX4 Wag 6 12 18
Datsun 710 4 8 12
Hornet 4 Drive 6 12 18
Hornet Sportabout 8 16 24
...
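If this is needed in more than one place, the same idea could be wrapped in a small function. This is only a sketch: bind_nth_column is a hypothetical helper name, and passing .name_repair = "minimal" is my assumption about how to keep the original column names (cyl, cyl, cyl, as in the question's cbind output) instead of the cyl...1 style suffixes:

```r
library(dplyr)

mylist <- list(mt1 = mtcars, mt2 = mtcars * 2, mt3 = mtcars * 3)

# Hypothetical helper: pull column `n` from every data frame in `dfs`
# and bind the results side by side
bind_nth_column <- function(dfs, n) {
  cols <- lapply(dfs, function(df) select(df, all_of(n)))
  # "minimal" leaves duplicated column names untouched
  bind_cols(cols, .name_repair = "minimal")
}

result <- bind_nth_column(mylist, 2)
head(result)
```

Duplicated column names make later selection by name ambiguous, so the default repaired names may be the better choice if the result is processed further.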
CodePudding user response:
Base R option -
do.call(cbind.data.frame, lapply(mylist, `[[`, 2))
# mt1 mt2 mt3
#1 6 12 18
#2 6 12 18
#3 4 8 12
#4 6 12 18
#5 8 16 24
#6 6 12 18
#7 8 16 24
#8 4 8 12
#9 4 8 12
#10 6 12 18
#11 6 12 18
#...
#...
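Note that `[[` returns plain vectors, so the car names are lost and the rows are simply numbered. A minimal sketch of one way to put them back, assuming every data frame in the list shares the row order of the first one (the variable name out is mine):

```r
mylist <- list(mt1 = mtcars, mt2 = mtcars * 2, mt3 = mtcars * 3)

# Bind the extracted vectors; columns are named after the list elements
out <- do.call(cbind.data.frame, lapply(mylist, `[[`, 2))

# Restore the row names from the first data frame (assumes identical row order)
rownames(out) <- rownames(mylist[[1]])

head(out)
```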