I have two data frames with different numbers of rows. df1 is longer than df2, and they share several rows in common.
My example:
df1 <- data.frame(name1 = c("a", "b", "c"),
                  name2 = c("a1", "b1", "c1"),
                  name3 = c("a2", "b2", "c2"))
df1
  name1 name2 name3
1     a    a1    a2
2     b    b1    b2
3     c    c1    c2
df2 <- data.frame(name1 = c("a", "b", "m"),
name2 = c("a3","b3", "m1"),
name3 = c("a4", "b4", "m2"))
df2
  name1 name2 name3
1     a    a3    a4
2     b    b3    b4
3     m    m1    m2
I would like to exclude the rows that the two data frames have in common and keep only the remaining rows of df2 (a single row in this case), using the tidyverse. Any suggestions?
Desired output
name1 name2 name3
m m1 m2
CodePudding user response:
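library(dplyr)
# rows of df1 whose name1 does not appear in df2
anti_join(df1, df2, by = "name1")
  name1 name2 name3
1     c    c1    c2
# rows of df2 whose name1 does not appear in df1 (the desired output)
anti_join(df2, df1, by = "name1")
  name1 name2 name3
1     m    m1    m2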
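As a self-contained sketch of the call above, using the data from the question: the object names `only_in_df2` and `no_full_match` are illustrative, not from the original post. It also shows why the `by` argument matters here. Matching on all three shared columns finds no identical rows, so every row of df2 is kept.

```r
library(dplyr)

# Data from the question
df1 <- data.frame(name1 = c("a", "b", "c"),
                  name2 = c("a1", "b1", "c1"),
                  name3 = c("a2", "b2", "c2"))
df2 <- data.frame(name1 = c("a", "b", "m"),
                  name2 = c("a3", "b3", "m1"),
                  name3 = c("a4", "b4", "m2"))

# Match on name1 only: keeps the rows of df2 whose name1 is absent from df1
only_in_df2 <- anti_join(df2, df1, by = "name1")

# Match on every shared column, spelled out explicitly.
# No row of df2 is identical to a row of df1, so nothing is dropped.
no_full_match <- anti_join(df2, df1, by = c("name1", "name2", "name3"))
```

Here `only_in_df2` holds the single row with name1 equal to "m", while `no_full_match` still has all three rows of df2.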
CodePudding user response:
We may use anti_join (originally posted as comments, well before the other answer was posted):
library(dplyr)
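# keeps the rows of df1 whose name1 has no match in df2;
# swap the two data frames to keep the unmatched rows of df2 instead
anti_join(df1, df2, by = "name1")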
data
df1 <- structure(list(name1 = c("a", "b", "c"), name2 = c("a1", "b1",
"c1"), name3 = c("a2", "b2", "c2")), class = "data.frame", row.names = c(NA,
-3L))
df2 <- structure(list(name1 = c("a", "b"), name2 = c("a3", "b3")), class = "data.frame", row.names = c(NA,
-2L))
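For comparison, a minimal sketch on the data given in this answer: with a single key column, the anti-join should be equivalent to filtering on `%in%`. The object names `via_join` and `via_filter` are illustrative only.

```r
library(dplyr)

# Data from this answer (note that df2 has only two rows and no name3 column)
df1 <- structure(list(name1 = c("a", "b", "c"),
                      name2 = c("a1", "b1", "c1"),
                      name3 = c("a2", "b2", "c2")),
                 class = "data.frame", row.names = c(NA, -3L))
df2 <- structure(list(name1 = c("a", "b"),
                      name2 = c("a3", "b3")),
                 class = "data.frame", row.names = c(NA, -2L))

# Rows of df1 whose name1 has no match in df2
via_join <- anti_join(df1, df2, by = "name1")

# Same selection written as a filter on the key column
via_filter <- dplyr::filter(df1, !name1 %in% df2$name1)
```

Both objects contain the single row with name1 equal to "c". The `%in%` form only covers a single key column; for several keys, anti_join with a longer `by` vector is the more direct choice.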