I have a data frame that I am trying to filter based on 2 ID columns from the first data frame. Here's data frame 1:
id_1 | id_2 |
---|---|
A | 1 |
B | 2 |
C | 3 |
And data frame 2:
id_1 | id_2 |
---|---|
A | 1 |
B | 2 |
A | 2 |
If I do something like...
df2_filtered <- df2 %>%
filter(id_1 %in% df1$id_1 &
id_2 %in% df1$id_2)
Then I get back the entirety of df2, which is not what I want. Even though "A" is in id_1 of df1, and "2" is in id_2 of df1, there is no row that has both. How do I fix it so that I only get back
id_1 | id_2 |
---|---|
A | 1 |
B | 2 |
CodePudding user response:
If you aren't bound to a dplyr
solution, then data.table
has a nice option:
library(data.table)
df1 = as.data.table(df1)
df2 = as.data.table(df2)
fintersect(df1, df2)
id_1 id_2
1: A 1
2: B 2
CodePudding user response:
You can use dplyr::inner_join
:
inner_join(data1, data2)
# Joining, by = c("id_1", "id_2")
# id_1 id_2
# 1 A 1
# 2 B 2