How to remove a row from a dataframe if a value does not exist in another dataframe in R?

Time:09-08

I have two dataframes, df_description and df_users. The first has a column named users. The second has two columns, user_1 and user_2. Every value in both columns of df_users should match a value in the users column of df_description.

If a value does not exist in df_description, the row of df_users that contains that value should be removed.

Here is an example:

#df_description
users
Adam
Micheal
George

And

#df_users
user_1   user_2
Adam     George
Adam     Micheal
George   Elizabeth #since Elizabeth does not exist in df_description, this row should be removed

The final df_users should look something like this:

#df_users
user_1   user_2
Adam     George
Adam     Micheal

CodePudding user response:

In base R, try:

# Keep only rows where both users appear in df_description$users
df_users[df_users$user_1 %in% df_description$users &
           df_users$user_2 %in% df_description$users, ]

Output:

#   user_1  user_2
#1    Adam  George
#2    Adam Micheal
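
To check this end to end, the example data can be rebuilt by hand. The following is a minimal sketch, assuming both data frames hold plain character columns as shown in the question; the name `keep` is introduced here only for illustration:

```r
# Rebuild the example data from the question
df_description <- data.frame(users = c("Adam", "Micheal", "George"))
df_users <- data.frame(
  user_1 = c("Adam", "Adam", "George"),
  user_2 = c("George", "Micheal", "Elizabeth")
)

# TRUE for rows where both user_1 and user_2 appear in df_description$users
keep <- df_users$user_1 %in% df_description$users &
  df_users$user_2 %in% df_description$users

# Overwrite df_users with the rows that passed the check
df_users <- df_users[keep, ]
print(df_users)
```

Storing the logical vector separately makes it easy to inspect which rows would be dropped (`df_users[!keep, ]`) before overwriting the data frame.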

CodePudding user response:

Use filter with if_all:

library(dplyr)
# df_users and df_description are the data frames from the question
df_users %>%
  filter(if_all(c(user_1, user_2), ~ .x %in% df_description$users))

Output:

  user_1  user_2
1   Adam  George
2   Adam Micheal
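
As for what `if_all()` does here: it applies the condition to each selected column and keeps a row only when every result is TRUE. The following is a minimal runnable sketch, assuming dplyr 1.0.4 or later (where `if_all()` is available) and the question's example data rebuilt by hand; `result` is just an illustrative name:

```r
library(dplyr)

# Rebuild the example data from the question
df_description <- data.frame(users = c("Adam", "Micheal", "George"))
df_users <- data.frame(
  user_1 = c("Adam", "Adam", "George"),
  user_2 = c("George", "Micheal", "Elizabeth")
)

# Keep rows where every selected column has a value found in df_description$users
result <- df_users %>%
  filter(if_all(c(user_1, user_2), ~ .x %in% df_description$users))

print(result)
```

If the table later gains more user columns, a selection helper such as `starts_with("user_")` could replace the explicit column list.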

Tags: r