How to cbind data.frames in R using the IDs in a specific column

Time:10-29

I have 2 dataframes:

df1

    Taxa        Env  Correlation
1  C1161         pH  -0.209916044
2  C1161         pH   0.101338976
3  C1161       Temp  -0.228451375
4  C1161       Temp  -0.218456646
5  C1161         TS   0.255112839
6   C26         NH4   0.379192859
7   C26        Prot   0.327016026
8   C26        Prot   0.602990615
9   C26       Carbo  -0.102919129
10  C26       Carbo   0.481216962
11 C1815         pH  -0.403348271
12 C1815         pH   0.126527189
13 C1815       Temp  -0.125038666
14 C1815       Temp  -0.343674237

df2

       Domain                Phylum
C1161 Bacteria        Actinobacteria
C1714 Bacteria        Actinobacteria
C26   Bacteria         Bacteroidetes
C895  Bacteria            Firmicutes
C1020 Bacteria            Firmicutes
C1815 Bacteria unclassified_Bacteria
C26   Bacteria            Firmicutes
C1620 Bacteria            Firmicutes
C822  Bacteria            Firmicutes

I want to combine both data frames by matching the IDs in the Taxa column of df1 against the row names of df2.

My problem is that I can't use the row names of df1, because an ID in the Taxa column can appear two or more times.

I just want something like:

    Taxa        Env  Correlation    Domain          Phylum
1  C1161         pH  -0.209916044   Bacteria        Actinobacteria
2  C1161         pH   0.101338976   Bacteria        Actinobacteria
3  C1161       Temp  -0.228451375   Bacteria        Actinobacteria
4  C1161       Temp  -0.218456646   Bacteria        Actinobacteria
5  C1161         TS   0.255112839   Bacteria        Actinobacteria
6   C26         NH4   0.379192859   Bacteria            Firmicutes
7   C26        Prot   0.327016026   Bacteria            Firmicutes
8   C26        Prot   0.602990615   Bacteria            Firmicutes
9   C26       Carbo  -0.102919129   Bacteria            Firmicutes
10  C26       Carbo   0.481216962   Bacteria            Firmicutes
11 C1815         pH  -0.403348271   Bacteria unclassified_Bacteria
12 C1815         pH   0.126527189   Bacteria unclassified_Bacteria
13 C1815       Temp  -0.125038666   Bacteria unclassified_Bacteria
14 C1815       Temp  -0.343674237   Bacteria unclassified_Bacteria

I tried:

 cbind(df1$Taxa, df2)

 merge(rownames(df2), df1, by = "Taxa")

Thanks

CodePudding user response:

library(tibble)
library(dplyr)

left_join(df1, rownames_to_column(df2, "Taxa"), by = "Taxa")

In base R:

df2$Taxa <- rownames(df2)  # copy the row names into a key column

merge(df1, 
      df2,
      all.x = TRUE,  # keep every row of df1 (left join)
      by = "Taxa")

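Note: df2 as posted lists C26 twice, which cannot happen with real row names (they must be unique). I deleted the first instance of C26 (Bacteroidetes) from df2 so that the result matches the output you provided.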

Output

    Taxa   Env Correlation   Domain                Phylum
1  C1161    pH  -0.2099160 Bacteria        Actinobacteria
2  C1161    pH   0.1013390 Bacteria        Actinobacteria
3  C1161  Temp  -0.2284514 Bacteria        Actinobacteria
4  C1161  Temp  -0.2184566 Bacteria        Actinobacteria
5  C1161    TS   0.2551128 Bacteria        Actinobacteria
6    C26   NH4   0.3791929 Bacteria            Firmicutes
7    C26  Prot   0.3270160 Bacteria            Firmicutes
8    C26  Prot   0.6029906 Bacteria            Firmicutes
9    C26 Carbo  -0.1029191 Bacteria            Firmicutes
10   C26 Carbo   0.4812170 Bacteria            Firmicutes
11 C1815    pH  -0.4033483 Bacteria unclassified_Bacteria
12 C1815    pH   0.1265272 Bacteria unclassified_Bacteria
13 C1815  Temp  -0.1250387 Bacteria unclassified_Bacteria
14 C1815  Temp  -0.3436742 Bacteria unclassified_Bacteria
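
One thing to watch with the base R version: merge() sorts the result by the key column by default, so the rows may not come back in the original df1 order. If the order matters, a lookup with match() is one possible alternative. The sketch below is only an illustration on shortened stand-in data (the values are abbreviated from the question, and the names idx and res are made up here); it assumes the row names of df2 are unique.

```r
# Shortened stand-ins for the question's data (values abbreviated).
df1 <- data.frame(
  Taxa = c("C1161", "C1161", "C26", "C26", "C1815"),
  Env = c("pH", "Temp", "NH4", "Prot", "pH"),
  Correlation = c(-0.21, -0.23, 0.38, 0.33, -0.40),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Domain = rep("Bacteria", 4),
  Phylum = c("Actinobacteria", "Firmicutes",
             "unclassified_Bacteria", "Firmicutes"),
  row.names = c("C1161", "C26", "C1815", "C895"),
  stringsAsFactors = FALSE
)

# For each Taxa value in df1, find the position of the matching row name in df2.
# IDs with no match give NA, which yields a row of NAs (like a left join).
idx <- match(df1$Taxa, rownames(df2))

# Pull the matched df2 rows (repeated as needed) and bind them column-wise.
res <- cbind(df1, df2[idx, , drop = FALSE])

# Reset the row names to 1..n instead of the repeated df2 row names.
rownames(res) <- NULL

print(res)
```

Because the df2 rows are picked by position, the result keeps the row order of df1 and has exactly as many rows as df1.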