Pull items from a list and output to a dataframe in R

Time:03-12

I have a list composed of several dataframes. I want to iterate over the list, pull the nth column of each dataframe, and place those columns side by side in a single dataframe.

Suppose I need to pull the second column from each dataframe in this list:

library(tidyverse)    
mylist <- list(mt1 = mtcars, mt2 = mtcars*2, mt3 = mtcars*3)

I want a result similar to this, with cbind:

> mylist[[1]][2] %>% 
    cbind(mylist[[2]][2]) %>% 
    cbind(mylist[[3]][2]) %>% 
    head()
                  cyl cyl cyl
Mazda RX4           6  12  18
Mazda RX4 Wag       6  12  18
Datsun 710          4   8  12
Hornet 4 Drive      6  12  18
Hornet Sportabout   8  16  24
Valiant             6  12  18

But I need code that works for any number of list elements, so that I do not have to rewrite it whenever the length of the list changes. How can I achieve this?

I can use a for loop, but the output is different from what I need:

for (i in seq_along(mylist)){
  print(mylist[[i]] %>% select(2)) 
}

The same with sapply or lapply:

sapply(mylist, function(x) x%>% select(2))

lapply(mylist, function(x) x%>% select(2))

With map_df I get a dataframe, but the columns are stacked on top of each other in a single column instead of side by side:

> map_df(mylist, function(x) x%>% select(2)) %>% 
    head()
                      cyl
Mazda RX4...1           6
Mazda RX4 Wag...2       6
Datsun 710...3          4
Hornet 4 Drive...4      6
Hornet Sportabout...5   8
Valiant...6             6

How can I pull the columns from each dataframe on the list, and arrange each column side by side?

CodePudding user response:

You can use map_dfc rather than map_df, as it will bind the columns.

library(tidyverse)

map_dfc(mylist, select, 2) %>% 
   head()

#                  cyl...1 cyl...2 cyl...3
#Mazda RX4               6      12      18
#Mazda RX4 Wag           6      12      18
#Datsun 710              4       8      12
#Hornet 4 Drive          6      12      18
#Hornet Sportabout       8      16      24
#Valiant                 6      12      18

To give each column a distinct name (for example, by appending a sequential number), we could use map2_dfc. You could also pass a different set of names as the second argument.

map2_dfc(mylist,
         seq_along(mylist),
         \(x, y) x %>% select(2) %>% rename(!!paste0(names(.)[1], y) := 1)) %>%
  head()

#                  cyl1 cyl2 cyl3
#Mazda RX4            6   12   18
#Mazda RX4 Wag        6   12   18
#Datsun 710           4    8   12
#Hornet 4 Drive       6   12   18
#Hornet Sportabout    8   16   24
#Valiant              6   12   18
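
As a variation on the naming idea above, the list element names themselves can serve as prefixes. This is a minimal sketch rather than part of the original answer; it assumes dplyr 1.0 or later (for rename_with), purrr (for imap), and R 4.1 or later (for the \(x) lambda syntax). The variable name side_by_side is only illustrative.

```r
library(dplyr)
library(purrr)

mylist <- list(mt1 = mtcars, mt2 = mtcars * 2, mt3 = mtcars * 3)

# imap() passes each data frame together with its list name, so the name
# can be pasted in front of the selected column's own name.
side_by_side <- mylist %>%
  imap(\(df, nm) df %>%
         select(2) %>%
         rename_with(\(col) paste(nm, col, sep = "_"))) %>%
  bind_cols()

names(side_by_side)
# "mt1_cyl" "mt2_cyl" "mt3_cyl"
```

Because every column already has a unique name, bind_cols has nothing to repair and no ...1, ...2 suffixes are added.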

CodePudding user response:

Here's how I'd do that:

library(dplyr)

variable_number_to_get <- 2

newList <- lapply(mylist, function(x) x %>% select(all_of(variable_number_to_get)))
bind_cols(newList)

                    cyl...1 cyl...2 cyl...3
Mazda RX4                 6      12      18
Mazda RX4 Wag             6      12      18
Datsun 710                4       8      12
Hornet 4 Drive            6      12      18
Hornet Sportabout         8      16      24
...

CodePudding user response:

Base R option (note that `[[` extracts each column as a bare vector, so the row names are dropped) -

do.call(cbind.data.frame, lapply(mylist, `[[`, 2))

#   mt1 mt2 mt3
#1    6  12  18
#2    6  12  18
#3    4   8  12
#4    6  12  18
#5    8  16  24
#6    6  12  18
#7    8  16  24
#8    4   8  12
#9    4   8  12
#10   6  12  18
#11   6  12  18
#...
#...
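
If the row names matter, the base R approach can be wrapped in a small function that restores them afterwards. This is a sketch under assumptions, not part of the original answer: pull_nth_column is a hypothetical helper name, and it assumes every data frame in the list has the same number of rows in the same order as the first one.

```r
# Hypothetical helper: pull the n-th column of every data frame in a list
# and place the results side by side, one column per list element.
pull_nth_column <- function(dfs, n) {
  # `[[` returns each column as a bare vector
  cols <- lapply(dfs, `[[`, n)
  # A named list of equal-length vectors becomes one column per element,
  # with the columns named after the list elements
  out <- as.data.frame(cols)
  # Assumption: all data frames share the row order of the first one
  rownames(out) <- rownames(dfs[[1]])
  out
}

mylist <- list(mt1 = mtcars, mt2 = mtcars * 2, mt3 = mtcars * 3)
result <- pull_nth_column(mylist, 2)
head(result)
```

Since n is an ordinary argument, the same call works for any column position and any number of list elements.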