Is it possible to filter the rows of one dataframe based on another dataframe?
I have these two dataframes:
df_node <- data.frame(id = c("a","b","c","d","e","f","g","h","i"),
                      group = c(1,1,1,2,2,2,3,3,3))
df_link <- data.frame(from = c("a","d","f","i","b"),
                      to = c("d","f","i","b","h"))
I would like to delete the rows of df_node whose id is not present in the second dataframe (in either the from or the to column).
CodePudding user response:
Here is a basic way to do that:
df_node <- data.frame(id = c("a","b","c","d","e","f","g","h","i"),
                      group = c(1,1,1,2,2,2,3,3,3))
df_link <- data.frame(from = c("a","d","f","i","b"),
                      to = c("d","f","i","b","h"))
library(dplyr)
df_result <- df_node %>%
  filter(id %in% c(df_link$from, df_link$to))
df_result
# > df_result
# id group
# 1 a 1
# 2 b 1
# 3 d 2
# 4 f 2
# 5 h 3
# 6 i 3
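To make it clearer what the filter is doing, here is a small sketch of the same idea split into steps, without dplyr. The names linked_ids, keep, kept_ids and dropped_ids are just illustrative. I set stringsAsFactors = FALSE on the assumption that the id columns should be plain character vectors; on R older than 4.0 the columns would otherwise be factors, and combining factors with c() may not give back the labels.

```r
df_node <- data.frame(id = c("a","b","c","d","e","f","g","h","i"),
                      group = c(1,1,1,2,2,2,3,3,3),
                      stringsAsFactors = FALSE)
df_link <- data.frame(from = c("a","d","f","i","b"),
                      to = c("d","f","i","b","h"),
                      stringsAsFactors = FALSE)

# Every id that occurs anywhere in df_link, deduplicated
linked_ids <- unique(c(df_link$from, df_link$to))

# Logical vector: TRUE where df_node$id is one of the linked ids
keep <- df_node$id %in% linked_ids

kept_ids    <- df_node$id[keep]   # ids that survive the filter
dropped_ids <- df_node$id[!keep]  # ids that are removed
```

Looking at dropped_ids is a quick way to check that only the rows you expect are being removed.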
CodePudding user response:
We could use a semi_join:
library(dplyr)
df_node |>
  semi_join(tibble(id = c(df_link$from, df_link$to)), by = "id")
Output:
id group
1 a 1
2 b 1
3 d 2
4 f 2
5 h 3
6 i 3
CodePudding user response:
Here is a one-liner with base R:
df_node[df_node$id %in% unlist(df_link),]
id group
1 a 1
2 b 1
4 d 2
6 f 2
8 h 3
9 i 3
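Note that base R subsetting keeps the original row names (1, 2, 4, 6, 8, 9 in the output above). If you would rather have them renumbered, a minimal sketch is below; res is just an illustrative name, and stringsAsFactors = FALSE is my assumption so that the ids are character on any R version.

```r
df_node <- data.frame(id = c("a","b","c","d","e","f","g","h","i"),
                      group = c(1,1,1,2,2,2,3,3,3),
                      stringsAsFactors = FALSE)
df_link <- data.frame(from = c("a","d","f","i","b"),
                      to = c("d","f","i","b","h"),
                      stringsAsFactors = FALSE)

# Same filter as the one-liner; row names still refer to the original positions
res <- df_node[df_node$id %in% unlist(df_link), ]
old_names <- rownames(res)

# Reset the row names so they run from 1 to nrow(res)
rownames(res) <- NULL
```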
But you could also use a join:
library(dplyr)
df_uniqueID <- data.frame(id = unique(c(df_link$from, df_link$to)))
right_join(df_node, df_uniqueID)
Joining, by = "id"
id group
1 a 1
2 b 1
3 d 2
4 f 2
5 h 3
6 i 3
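One thing to keep in mind with the right_join: it keeps every row of the second table, so an id that appears in the links but not in df_node would come back as an extra row with NA in group. With the data in the question that never happens, so the result is the same. A small sketch with a made-up id "z" (hypothetical, not from the question) shows the difference compared with semi_join:

```r
library(dplyr)

df_node <- data.frame(id = c("a", "b", "c"),
                      group = c(1, 1, 1),
                      stringsAsFactors = FALSE)
# Hypothetical lookup table: "z" has no counterpart in df_node
df_ids <- data.frame(id = c("a", "z"), stringsAsFactors = FALSE)

# right_join keeps all rows of df_ids, so "z" appears with group = NA
via_right <- right_join(df_node, df_ids, by = "id")

# semi_join only keeps rows of df_node that have a match in df_ids
via_semi <- semi_join(df_node, df_ids, by = "id")
```

If the goal is strictly to filter df_node, semi_join (or inner_join on the unique ids) is probably the safer choice.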