How to find matches between two dataframe rows on multiple columns? [duplicate]

Time:09-24

Assume I have two data frames:

df1 <- data.frame (name = c("Mike", "Paul", "Paul", "Henry"),
                   age = c(20, 21, 22, 23))

df2 <- data.frame (name = c("Sam", "Paul", "Paul", "Bob"),
                   age = c(26, 30, 22, 23))

I would like to find which rows present in df1 have an identical match in df2 (in this case there will be one such match: "Paul", 22).

What is the most elegant way to do this in R?

CodePudding user response:

merge(df1, df2) returns the rows that the two data frames have in common. By default, merge joins on all columns whose names appear in both data frames. If the data contains other columns, specify the join columns explicitly with by, e.g.

merge(df1, df2, by = c('name', 'age'))
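
As a quick check of the default behaviour described above, the sketch below compares the default call with the explicit one on the question's data. The variable names shared_cols, default_join and explicit_join are only illustrative.

```r
df1 <- data.frame(name = c("Mike", "Paul", "Paul", "Henry"),
                  age = c(20, 21, 22, 23))
df2 <- data.frame(name = c("Sam", "Paul", "Paul", "Bob"),
                  age = c(26, 30, 22, 23))

# Column names present in both data frames; merge() joins on these by default
shared_cols <- intersect(names(df1), names(df2))
print(shared_cols)

# The default call and the explicit call should therefore give the same result
default_join  <- merge(df1, df2)
explicit_join <- merge(df1, df2, by = c("name", "age"))
print(identical(default_join, explicit_join))
```

Spelling out by matters mainly when the data frames share columns that should not take part in the match.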

To count the matching rows you may use nrow.

nrow(merge(df1, df2, by = c('name', 'age')))
#[1] 1
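
merge returns the matched rows themselves, not their positions in df1. If the row indices are needed, one possible base R sketch is to build a key per row and test membership with %in%. The names key1, key2 and in_df2 are made up for illustration, and the approach assumes the separator "|" never occurs in the data.

```r
df1 <- data.frame(name = c("Mike", "Paul", "Paul", "Henry"),
                  age = c(20, 21, 22, 23))
df2 <- data.frame(name = c("Sam", "Paul", "Paul", "Bob"),
                  age = c(26, 30, 22, 23))

# One text key per row, built from the columns being compared
# (assumption: "|" does not appear in any value)
key1 <- paste(df1$name, df1$age, sep = "|")
key2 <- paste(df2$name, df2$age, sep = "|")

# TRUE for every df1 row whose key also occurs in df2
in_df2 <- key1 %in% key2

print(which(in_df2))   # positions of the matching rows in df1
print(df1[in_df2, ])   # the matching rows themselves
```

With the example data, only the third row of df1 ("Paul", 22) is flagged.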

CodePudding user response:

Or use dplyr's inner_join, which I believe is faster than merge:

library(dplyr)
df1 %>% inner_join(df2, by = c("name", "age")) %>% count()
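
If the goal is the matching rows of df1 rather than a count, dplyr's semi_join may be a closer fit. An inner join repeats a df1 row once for every matching row in df2, while semi_join keeps each df1 row at most once. A minimal sketch on the question's data (the name matched is only illustrative):

```r
library(dplyr)

df1 <- data.frame(name = c("Mike", "Paul", "Paul", "Henry"),
                  age = c(20, 21, 22, 23))
df2 <- data.frame(name = c("Sam", "Paul", "Paul", "Bob"),
                  age = c(26, 30, 22, 23))

# Keep the rows of df1 that have at least one match in df2 on both columns;
# no columns from df2 are added and no df1 row is duplicated
matched <- semi_join(df1, df2, by = c("name", "age"))
print(matched)
```

With the example data this keeps only the ("Paul", 22) row; nrow(matched) gives the count.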