Home > database >  How do I remove .x.x.x.... and .y.y.y... from headings when I do a full join of multiple dataframes
How do I remove .x.x.x.... and .y.y.y... from headings when I do a full join of multiple dataframes

Time:01-13

file1

Country Name Race A B C
.. .. .. .. .. ..

file2

Country Name Race D E F
.. .. .. .. .. ..

file3

Country Name Race G H I
.. .. .. .. .. ..

file4

Country Name Race J K L
.. .. .. .. .. ..

The above are a some .csv data frames that I assigned to a variable called file.list, and then used lapply. The purpose was to full join every data frame into a single one, as you can see by the code below.

file.list= c(file1.csv, file2.csv, file3.csv, file4.csv)

df.list <- lapply(file.list, read.csv)

data <-df.list %>% reduce(full_join, by=c("Country", "Name", "Race"))

data

Country Name Race A.x B.x C.x D.y E.y F.y G.x.x H.x.x I.x.x J.y.y K.y.y L.y.y
.. .. .. .. .. .. .. .. .. .. .. .. .. .. ..

Although the full join works using the above code, the headers have a .x or .y added behind their names. This code also produced similar results:

data<-power_full_join(df.list, by=c("Country", "Name", "Race"))

How can I full join such that the headers retain their original names without the .x.x... and .y.y.. behind?

CodePudding user response:

full_join() has an argument suffix which can be set to empty strings to achieve this.

data <-df.list %>% reduce(full_join, by=c("Country", "Name", "Race"), suffix=c("",""))

CodePudding user response:

A base R alternative joining a list of data frames.

Example

df_list
[[1]]
  Country Name Race A B C
1      Rf  wef  wed 1 1 1
2      Ew eggw   qw 2 2 2

[[2]]
  Country Name Race D E F
1      Gw  wef  wed 3 5 7
2      Wd eggw   qw 4 6 8

[[3]]
  Country Name Race G  H I
1      Qp  wef  wed 3 10 7
2      Ew eggw   qw 4 11 8

Function

join_list <- function(x, ax = T, ay = F){ 
  dff <- merge(x[[1]], x[[2]], all.x=ax, all.y=ay)
  if(length(df_list) > 2){
    for(i in seq_along(x)[3:length(x)]){ 
      dff <- merge(dff, x[[i]], all.x=ax, all.y=ay) 
  }}; dff }

use

join_list(df_list, ax=T, ay=T)
  Country Name Race  A  B  C  D  E  F  G  H  I
1      Ew eggw   qw  2  2  2 NA NA NA  4 11  8
2      Gw  wef  wed NA NA NA  3  5  7 NA NA NA
3      Qp  wef  wed NA NA NA NA NA NA  3 10  7
4      Rf  wef  wed  1  1  1 NA NA NA NA NA NA
5      Wd eggw   qw NA NA NA  4  6  8 NA NA NA

Data

df_list <- list(structure(list(Country = c("Rf", "Ew"), Name = c("wef",
"eggw"), Race = c("wed", "qw"), A = 1:2, B = 1:2, C = 1:2), class = "data.frame", row.names = c(NA,
-2L)), structure(list(Country = c("Gw", "Wd"), Name = c("wef",
"eggw"), Race = c("wed", "qw"), D = 3:4, E = 5:6, F = 7:8), row.names = c(NA,
-2L), class = "data.frame"), structure(list(Country = c("Qp",
"Ew"), Name = c("wef", "eggw"), Race = c("wed", "qw"), G = 3:4,
    H = 10:11, I = 7:8), row.names = c(NA, -2L), class = "data.frame"))

CodePudding user response:

We could use plyr::join_all

plyr::join_all(df_list, type = "full")

-output

  Country Name Race  A  B  C  D  E  F  G  H  I
1      Rf  wef  wed  1  1  1 NA NA NA NA NA NA
2      Ew eggw   qw  2  2  2 NA NA NA  4 11  8
3      Gw  wef  wed NA NA NA  3  5  7 NA NA NA
4      Wd eggw   qw NA NA NA  4  6  8 NA NA NA
5      Qp  wef  wed NA NA NA NA NA NA  3 10  7

data

df_list <- list(structure(list(Country = c("Rf", "Ew"), Name = c("wef", 
"eggw"), Race = c("wed", "qw"), A = 1:2, B = 1:2, C = 1:2), class = "data.frame", row.names = c(NA, 
-2L)), structure(list(Country = c("Gw", "Wd"), Name = c("wef", 
"eggw"), Race = c("wed", "qw"), D = 3:4, E = 5:6, F = 7:8), row.names = c(NA, 
-2L), class = "data.frame"), structure(list(Country = c("Qp", 
"Ew"), Name = c("wef", "eggw"), Race = c("wed", "qw"), G = 3:4, 
    H = 10:11, I = 7:8), row.names = c(NA, -2L), class = "data.frame"))
  •  Tags:  
  • r
  • Related