make a join of all the data frames inside a list in R

Time:08-24

I have a list of 3 data frames. All 3 data frames have a variable in common, and I would like to make a full outer join of them. I know I can iterate over the elements of the list, but is there another way of doing this?

CodePudding user response:

I guess you can try Reduce to merge all the data.frames iteratively, e.g.,

Reduce(function(x, y) merge(x, y, all = TRUE), list(df1, df2, df3))
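
To make that one-liner easier to try out, here is a minimal sketch with made-up data. The key column name id and the example values are assumptions for illustration, not from the question. Passing by explicitly is optional, but it keeps merge from also joining on any other columns that happen to share a name.

```r
# Hypothetical example data: three data frames sharing a key column "id"
df1 <- data.frame(id = 1:3, b = letters[1:3])
df2 <- data.frame(id = 2:4, d = letters[1:3])
df3 <- data.frame(id = 5:7, f = letters[1:3])
df_list <- list(df1, df2, df3)

# all = TRUE requests a full outer join at each step;
# Reduce folds merge() over the list from left to right
merged <- Reduce(function(x, y) merge(x, y, by = "id", all = TRUE), df_list)
merged
```

Rows whose id is missing from one of the data frames get NA in that data frame's columns.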

CodePudding user response:

If you only have 3 data.frames, I'd recommend joining them manually. Here's a minimal reproducible example:

# 3 data.frames
df1 <- data.frame(a=c(1:3), b=letters[1:3])
df2 <- data.frame(c=c(2,3,4), d=c(letters[1:3]))
df3 <- data.frame(e=c(5:7), f=c(letters[1:3]))

df1
#   a b
# 1 1 a
# 2 2 b
# 3 3 c
df2
#   c d
# 1 2 a
# 2 3 b
# 3 4 c
df3
#   e f
# 1 5 a
# 2 6 b
# 3 7 c

Now full_join them:

library(dplyr)  # provides full_join() and %>%; library(tidyverse) also works
df1 %>% 
  full_join(df2, by = c("a"="c")) %>% 
  full_join(df3, by = c("a"="e")) 

#   a    b    d    f
# 1 1    a <NA> <NA>
# 2 2    b    a <NA>
# 3 3    c    b <NA>
# 4 4 <NA>    c <NA>
# 5 5 <NA> <NA>    a
# 6 6 <NA> <NA>    b
# 7 7 <NA> <NA>    c

Note: since you mention the data.frames are inside a list, here's how you could access them:

df_list <- list(df1, df2, df3)

df_list[[1]] %>% 
  full_join(df_list[[2]], by = c("a"="c")) %>% 
  full_join(df_list[[3]], by = c("a"="e")) 
# gives same result as above
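
If the list grows beyond a few data frames, writing each join by hand gets tedious. One possible sketch in base R, assuming (and this is only an assumption) that the key is always the first column of each data frame, is to give the key a common name first and then fold merge over the list:

```r
# Same example data as above: the key column has a different name in each data frame
df1 <- data.frame(a = 1:3, b = letters[1:3])
df2 <- data.frame(c = c(2, 3, 4), d = letters[1:3])
df3 <- data.frame(e = 5:7, f = letters[1:3])
df_list <- list(df1, df2, df3)

# Assumption: the key is the first column of every data frame,
# so rename that column to "a" everywhere
renamed <- lapply(df_list, function(df) {
  names(df)[1] <- "a"
  df
})

# Full outer join over the whole list on the now-common key
joined <- Reduce(function(x, y) merge(x, y, by = "a", all = TRUE), renamed)
joined
```

If the key is not reliably the first column, the renaming step would need an explicit mapping of key names instead.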

CodePudding user response:

You can also use purrr's reduce for this, e.g.

library(purrr)
library(dplyr)
purrr::reduce(df_list, full_join, by = "a")

Data (here the common column is named a in all three data frames):

df1 <- data.frame(a=c(1:3), b=letters[1:3])
df2 <- data.frame(a=c(2,3,4), d=c(letters[1:3]))
df3 <- data.frame(a=c(5:7), f=c(letters[1:3]))
df_list <- list(df1, df2, df3)

Output:

  a    b    d    f
1 1    a <NA> <NA>
2 2    b    a <NA>
3 3    c    b <NA>
4 4 <NA>    c <NA>
5 5 <NA> <NA>    a
6 6 <NA> <NA>    b
7 7 <NA> <NA>    c