file1
Country | Name | Race | A | B | C |
---|---|---|---|---|---|
.. | .. | .. | .. | .. | .. |
file2
Country | Name | Race | D | E | F |
---|---|---|---|---|---|
.. | .. | .. | .. | .. | .. |
file3
Country | Name | Race | G | H | I |
---|---|---|---|---|---|
.. | .. | .. | .. | .. | .. |
file4
Country | Name | Race | J | K | L |
---|---|---|---|---|---|
.. | .. | .. | .. | .. | .. |
The above are a some .csv data frames that I assigned to a variable called file.list
, and then used lapply
. The purpose was to full join every data frame into a single one, as you can see by the code below.
file.list= c(file1.csv, file2.csv, file3.csv, file4.csv)
df.list <- lapply(file.list, read.csv)
data <-df.list %>% reduce(full_join, by=c("Country", "Name", "Race"))
data
Country | Name | Race | A.x | B.x | C.x | D.y | E.y | F.y | G.x.x | H.x.x | I.x.x | J.y.y | K.y.y | L.y.y |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
.. | .. | .. | .. | .. | .. | .. | .. | .. | .. | .. | .. | .. | .. | .. |
Although the full join works using the above code, the headers have a .x or .y added behind their names. This code also produced similar results:
data<-power_full_join(df.list, by=c("Country", "Name", "Race"))
How can I full join such that the headers retain their original names without the .x.x... and .y.y.. behind?
CodePudding user response:
full_join()
has an argument suffix
which can be set to empty strings to achieve this.
data <-df.list %>% reduce(full_join, by=c("Country", "Name", "Race"), suffix=c("",""))
CodePudding user response:
A base R alternative joining a list of data frames.
Example
df_list
[[1]]
Country Name Race A B C
1 Rf wef wed 1 1 1
2 Ew eggw qw 2 2 2
[[2]]
Country Name Race D E F
1 Gw wef wed 3 5 7
2 Wd eggw qw 4 6 8
[[3]]
Country Name Race G H I
1 Qp wef wed 3 10 7
2 Ew eggw qw 4 11 8
Function
join_list <- function(x, ax = T, ay = F){
dff <- merge(x[[1]], x[[2]], all.x=ax, all.y=ay)
if(length(df_list) > 2){
for(i in seq_along(x)[3:length(x)]){
dff <- merge(dff, x[[i]], all.x=ax, all.y=ay)
}}; dff }
use
join_list(df_list, ax=T, ay=T)
Country Name Race A B C D E F G H I
1 Ew eggw qw 2 2 2 NA NA NA 4 11 8
2 Gw wef wed NA NA NA 3 5 7 NA NA NA
3 Qp wef wed NA NA NA NA NA NA 3 10 7
4 Rf wef wed 1 1 1 NA NA NA NA NA NA
5 Wd eggw qw NA NA NA 4 6 8 NA NA NA
Data
df_list <- list(structure(list(Country = c("Rf", "Ew"), Name = c("wef",
"eggw"), Race = c("wed", "qw"), A = 1:2, B = 1:2, C = 1:2), class = "data.frame", row.names = c(NA,
-2L)), structure(list(Country = c("Gw", "Wd"), Name = c("wef",
"eggw"), Race = c("wed", "qw"), D = 3:4, E = 5:6, F = 7:8), row.names = c(NA,
-2L), class = "data.frame"), structure(list(Country = c("Qp",
"Ew"), Name = c("wef", "eggw"), Race = c("wed", "qw"), G = 3:4,
H = 10:11, I = 7:8), row.names = c(NA, -2L), class = "data.frame"))
CodePudding user response:
We could use plyr::join_all
plyr::join_all(df_list, type = "full")
-output
Country Name Race A B C D E F G H I
1 Rf wef wed 1 1 1 NA NA NA NA NA NA
2 Ew eggw qw 2 2 2 NA NA NA 4 11 8
3 Gw wef wed NA NA NA 3 5 7 NA NA NA
4 Wd eggw qw NA NA NA 4 6 8 NA NA NA
5 Qp wef wed NA NA NA NA NA NA 3 10 7
data
df_list <- list(structure(list(Country = c("Rf", "Ew"), Name = c("wef",
"eggw"), Race = c("wed", "qw"), A = 1:2, B = 1:2, C = 1:2), class = "data.frame", row.names = c(NA,
-2L)), structure(list(Country = c("Gw", "Wd"), Name = c("wef",
"eggw"), Race = c("wed", "qw"), D = 3:4, E = 5:6, F = 7:8), row.names = c(NA,
-2L), class = "data.frame"), structure(list(Country = c("Qp",
"Ew"), Name = c("wef", "eggw"), Race = c("wed", "qw"), G = 3:4,
H = 10:11, I = 7:8), row.names = c(NA, -2L), class = "data.frame"))