I'm trying to find a way to filter a data set so that I see only the rows that do NOT have a measurement in a particular interval. For some reason my brain cannot seem to put the logic together. I've created an example dataset below to try to explain my thinking.
library(dplyr)
df <- data.frame(id = c(1,1,1,1,1,1,1,1,2,2,2,2,2, 3, 3),
number = c(-10, -9, -8, -1, -0.5, 0.0, 0.23, 5, -2, -1.1, -.88, 1.2, 4, -10,10))
df
So here, ideally, I want to find the unique IDs that do NOT have any values between -1 and 0. ID 1 and ID 2 both have values between -1 and 0, so they would not be included.
df %>% filter(between(number, -1, 0))
But ID 3 only has measurements of -10 and 10, so that ID has no measurements in the interval from -1 to 0. I'm trying to get that as my final output (the 2 rows with ID 3), but I can't think of a way to achieve that.
Thanks in advance!
CodePudding user response:
You could use group_by and filter to keep the groups with all values not in a specific range, like this:
library(dplyr)
df <- data.frame (id = c(1,1,1,1,1,1,1,1,2,2,2,2,2, 3, 3),
number = c(-10, -9, -8, -1, -0.5, 0.0, 0.23, 5, -2, -1.1, -.88, 1.2, 4, -10,10))
df %>%
group_by(id) %>%
filter(all(!between(number, -1, 0)))
#> # A tibble: 2 × 2
#> # Groups: id [1]
#> id number
#> <dbl> <dbl>
#> 1 3 -10
#> 2 3 10
Created on 2022-09-30 with reprex v2.0.2
CodePudding user response:
df %>% group_by(id) %>% filter(!any(between(number, -1, 0)))
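For what it's worth, `!any(x)` and `all(!x)` are logically the same test, so this should keep the same rows as the `all(!between(...))` version. Here is a minimal sketch that checks this on the example data and also reduces the result to just the IDs, since the question mentions unique IDs. The names `rows_all`, `rows_any` and `ids_outside` are made up for illustration. Note that `between()` is inclusive on both ends, so values of exactly -1 or 0 count as being inside the interval.

```r
library(dplyr)

df <- data.frame(
  id = c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3),
  number = c(-10, -9, -8, -1, -0.5, 0.0, 0.23, 5, -2, -1.1, -.88, 1.2, 4, -10, 10)
)

# Keep groups where every value is outside [-1, 0]
rows_all <- df %>%
  group_by(id) %>%
  filter(all(!between(number, -1, 0))) %>%
  ungroup()

# Keep groups where no value is inside [-1, 0]
rows_any <- df %>%
  group_by(id) %>%
  filter(!any(between(number, -1, 0))) %>%
  ungroup()

# Compare the two results
identical(rows_all, rows_any)

# Only the IDs, one row per ID
ids_outside <- rows_any %>% distinct(id)
ids_outside
```

The `ungroup()` calls are optional here; they just drop the grouping so later steps work on the whole table again.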