Home > OS >  Combine dataframes only by mutual rownames
Combine dataframes only by mutual rownames

Time:07-20

I want to combine about 20 dataframes, with different lengths of rows and columns, only by the mutual rownames. Any rows that are not shared for ALL dataframes are deleted. So for example on two dataframes:

     Patient1   Patient64   Patient472
ABC     28          38          0
XYZ     92          11          998
WWE      1          10          282
ICQ      0          76          56
SQL     22          1002        778

combine with

     Pat_9   Pat_1   Pat_111
ABC    65      44       874
CBA    3       311      998
WWE    2       1110     282
vVv    2       760      56
GHG    12      1200     778

The result would be

   Patient1 Patient64 Patient472  Pat_9  Pat_1  Pat_111
ABC    28      38        0         65     44      874
WWE    1       10       282         2     1110    282

I know how to use rbind and cbind but not for the purpose of joining according to shared rownames.

CodePudding user response:

Try this considering change list arguments to df1 , df2 , df3 , ... , df20 your data.frames

l <- lapply(list(df1 , df2 ) , \(x) {x[["id"]] <- rownames(x) ; x})

Reduce(\(x,y) merge(x,y , by = "id") , l)

CodePudding user response:

you can try

merge(d1, d2, by = "row.names")
  Row.names Patient1 Patient64 Patient472 Pat_9 Pat_1 Pat_111
1       ABC       28        38          0    65    44     874
2       WWE        1        10        282     2  1110     282

CodePudding user response:

You can first turn your rownames_to_column and use a inner_join and at last convert column_to_rownames back like this:

df1 <- read.table(text="     Patient1   Patient64   Patient472
ABC     28          38          0
XYZ     92          11          998
WWE      1          10          282
ICQ      0          76          56
SQL     22          1002        778", header = TRUE)

df2 <- read.table(text = "     Pat_9   Pat_1   Pat_111
ABC    65      44       874
CBA    3       311      998
WWE    2       1110     282
vVv    2       760      56
GHG    12      1200     778", header = TRUE)

library(dplyr) 
library(tibble)
df1 %>%
  rownames_to_column() %>%
  inner_join(df2 %>% rownames_to_column(), by = "rowname") %>%
  column_to_rownames()
#>     Patient1 Patient64 Patient472 Pat_9 Pat_1 Pat_111
#> ABC       28        38          0    65    44     874
#> WWE        1        10        282     2  1110     282

Created on 2022-07-20 by the reprex package (v2.0.1)

Option with list of dataframes:

dfs_list <- list(df1, df2)
transform(Reduce(merge, lapply(dfs_list, function(x) data.frame(x, rn = row.names(x)))), row.names=rn, rn=NULL)
#>     Patient1 Patient64 Patient472 Pat_9 Pat_1 Pat_111
#> ABC       28        38          0    65    44     874
#> WWE        1        10        282     2  1110     282

Created on 2022-07-20 by the reprex package (v2.0.1)

  • Related