Home > Back-end >  Merging two Data frames in R?
Merging two Data frames in R?

Time:07-16

I am new to R and facing this problem (it may be very basic so pardon me). I have looked through the left/right merges and data.table merges but haven't had success.

Dataset1: 
Fips     County    Date          Result 
21001    Adair     01-03-2004    0.008 
21175    Morgan    11-21-2002    0.076

Dataset2:
Fips     County    Date          Pre
21001    Adair     01-03-2004    2.08
21001    Adair     01-04-2004    0.076
21001    Adair     01-04-2004    4.76
21017    Bourbon   01-03-2004    0.067
21175    Morgan    11-21-2002    0.076
21175    Morgan    05-04-2004    1.54
21175    Morgan    07-04-2004    0.008
21175    Morgan    11-21-2006    0.076

Desired_output:
Fips     County    Date          Result      Pre
21001    Adair     01-03-2004    0.008       2.08
21175    Morgan    11-21-2002    0.076       0.076

This is the code I am using:

Desired_output <- merge(Dataset1, Dataset2, by.x=c("Fips", "Date"), by.y=c("Fips", "Date"))

But it isn't giving me the desired output. It's just combining everything in the two datasets together?

Any help will be appreciated!

CodePudding user response:

If I've understood the question then this should work:

library(dplyr)
left_join(Dataset1, Dataset2, by=c("Fips", "County", "Date"))

There is a description of the different types of joins here:

CodePudding user response:

You can use an inner_join to get your desired result:

inner_join(): includes all rows in x and y.

Code:

Dataset1 <- read.table(text ="Fips     County    Date          Result 
21001    Adair     01-03-2004    0.008 
21175    Morgan    11-21-2002    0.076", header = TRUE)

Dataset2 <- read.table(text = "Fips     County    Date          Pre
21001    Adair     01-03-2004    2.08
21001    Adair     01-04-2004    0.076
21001    Adair     01-04-2004    4.76
21017    Bourbon   01-03-2004    0.067
21175    Morgan    11-21-2002    0.076
21175    Morgan    05-04-2004    1.54
21175    Morgan    07-04-2004    0.008
21175    Morgan    11-21-2006    0.076", header = TRUE)

library(dplyr)
inner_join(Dataset1, Dataset2)
#> Joining, by = c("Fips", "County", "Date")
#>    Fips County       Date Result   Pre
#> 1 21001  Adair 01-03-2004  0.008 2.080
#> 2 21175 Morgan 11-21-2002  0.076 0.076

Created on 2022-07-16 by the reprex package (v2.0.1)

  • Related