How to remove rows in df1 that are present in df2?

Time:10-12

Assume I have two data frames:

df1 <- data.frame(name = c("Mike", "Paul", "Paul", "Henry"),
                  age = c(20, 21, 22, 23))

df2 <- data.frame(name = c("Sam", "Paul", "Paul", "Bob"),
                  age = c(26, 30, 22, 23))

I would like to remove row 3 from df1, because this row is also present in df2.

What is the most elegant way to do this in R?

CodePudding user response:

Use setdiff from dplyr, which returns the rows of df1 that do not appear in df2:

library(dplyr)
setdiff(df1, df2)
   name age
1  Mike  20
2  Paul  21
3 Henry  23
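
One caveat worth knowing: dplyr's setdiff treats the data frames as sets, so it also collapses duplicate rows in its result. A minimal sketch of that behaviour follows; df1_dup is a hypothetical variant of df1 with the Mike row repeated, not part of the original question.

```r
library(dplyr)

# Hypothetical variant of df1 with the Mike/20 row repeated
df1_dup <- data.frame(name = c("Mike", "Mike", "Paul", "Paul", "Henry"),
                      age  = c(20, 20, 21, 22, 23))
df2 <- data.frame(name = c("Sam", "Paul", "Paul", "Bob"),
                  age  = c(26, 30, 22, 23))

# Paul/22 is dropped because it also appears in df2;
# the repeated Mike/20 row is collapsed to a single row as well.
res_setdiff <- setdiff(df1_dup, df2)
res_setdiff
```

If repeated rows in df1 need to survive, a join-based approach is probably the safer choice.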

If the comparison should be based on the columns common to both data frames (or a subset of them), use anti_join:

anti_join(df1, df2)

In this example, all columns are common to both data frames, so by default anti_join matches on all of them. To match on a subset of the columns only, specify it in by:

anti_join(df1, df2, by = c('name'))
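
To make the difference concrete, here is a small sketch using the df1 and df2 from the question. The result names by_all and by_name are only illustrative.

```r
library(dplyr)

df1 <- data.frame(name = c("Mike", "Paul", "Paul", "Henry"),
                  age  = c(20, 21, 22, 23))
df2 <- data.frame(name = c("Sam", "Paul", "Paul", "Bob"),
                  age  = c(26, 30, 22, 23))

# Match on both columns: only the Paul/22 row has a match in df2 and is dropped.
by_all <- anti_join(df1, df2, by = c("name", "age"))

# Match on name only: every Paul row is dropped, regardless of age.
by_name <- anti_join(df1, df2, by = "name")
```

Matching on name alone removes more than the question asks for (both Paul rows go), so matching on all common columns is the variant that fits the original problem.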
Tags: r