Suppose I have the following dataset, test:
> test = data.frame(location = c("here", "there", "here", "there", "where"), x = 1:5, y = c(6,7,8,6,10))
> test
location x y
1 here 1 6
2 there 2 7
3 here 3 8
4 there 4 6
5 where 5 10
Then I want to filter so that, if any row for a location satisfies a condition on y, every row for that location is kept in the dataset. Something like:
test %>% filter_something(y == 6)
location x y
1 here 1 6
2 there 2 7
3 here 3 8
4 there 4 6
Note that rows 2 and 3 are kept even though y is not 6 there, since there is at least one row where the same location matches the 'right' y.
I can solve this by creating another dataset filtered on y == 6 and then doing an inner join with test, but is there a more elegant option? I'm not filtering on just this variable; I'm using other columns too.
CodePudding user response:
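We can group_by(location), then use any(condition) inside filter(), so that a whole group is kept whenever at least one of its rows meets the condition:
library(dplyr)
test %>%
  group_by(location) %>%
  filter(any(y == 6)) %>%  # TRUE for the whole group if any row has y == 6
  ungroup()                # drop the grouping so later steps are not grouped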
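Since the question mentions filtering on other columns too, the same pattern might extend by combining conditions inside any(). This is only a sketch: the second condition (x >= 2) is a made-up example, not something from the question, and the choice of & assumes both conditions must hold on the same row.

```r
library(dplyr)

test <- data.frame(
  location = c("here", "there", "here", "there", "where"),
  x = 1:5,
  y = c(6, 7, 8, 6, 10)
)

# Hypothetical extra condition (x >= 2), added only for illustration.
# A location is kept when at least one of its rows meets both conditions at once.
result <- test %>%
  group_by(location) %>%
  filter(any(y == 6 & x >= 2)) %>%
  ungroup()
```

If the conditions may be satisfied by different rows of the same location, any(y == 6) & any(x >= 2) would be the variant to try instead.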
CodePudding user response:
If we want to use data.table, we can first get the locations associated with y == 6 and then filter on those, all in one line:
library(data.table)
# setDT() converts the data.frame to a data.table by reference, so no reassignment is needed
setDT(test)
# keep only the locations associated with y == 6
test <- test[location %in% test[y == 6]$location]
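A grouped variant in data.table is also possible, mirroring the any() idea from the dplyr answer. The sketch below rebuilds the example data under the assumed name dt so it can run on its own; the names dt and kept are illustrative only.

```r
library(data.table)

dt <- data.table(
  location = c("here", "there", "here", "there", "where"),
  x = 1:5,
  y = c(6, 7, 8, 6, 10)
)

# For each location, return the group's rows (.SD) only if any y equals 6;
# groups that fail the test contribute nothing to the result.
kept <- dt[, if (any(y == 6)) .SD, by = location]
```

Note that the rows come back grouped by location rather than in their original order, so the %in% version may be preferable when row order matters.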