Finding unique rows that are NOT between an interval

Time:10-01

I'm trying to find a way to filter a data set so that I see only the rows for ids that do NOT have a measurement in a particular interval. For some reason my brain cannot seem to put the logic together. I've created an example dataset below to try to explain my thinking.

library(dplyr)

df <- data.frame (id  = c(1,1,1,1,1,1,1,1,2,2,2,2,2, 3, 3),
                  number = c(-10, -9, -8, -1, -0.5, 0.0, 0.23, 5, -2, -1.1, -.88, 1.2, 4, -10,10))

df

So here, ideally, I want to find the unique ids that do NOT have any values between -1 and 0 (inclusive). ID 1 and ID 2 both have values between -1 and 0, so they would not be included. The following shows the rows that do fall in that interval:

df %>% filter(between(number, -1, 0))

But ID 3 only has measurements of -10 and 10, so that ID has no measurements in the interval from -1 to 0. I'm trying to get that as my final output (the 2 rows with ID 3), but I can't think of a way to achieve that.

Thanks in advance!

CodePudding user response:

You could use group_by() and then filter() to keep only the groups in which all values fall outside the specified range, like this:

library(dplyr)

df <- data.frame (id  = c(1,1,1,1,1,1,1,1,2,2,2,2,2, 3, 3),
                  number = c(-10, -9, -8, -1, -0.5, 0.0, 0.23, 5, -2, -1.1, -.88, 1.2, 4, -10,10))

df %>% 
  group_by(id) %>%
  filter(all(!between(number, -1, 0)))
#> # A tibble: 2 × 2
#> # Groups:   id [1]
#>      id number
#>   <dbl>  <dbl>
#> 1     3    -10
#> 2     3     10

Created on 2022-09-30 with reprex v2.0.2
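If only the ids themselves are needed rather than their rows, the same grouped filter can be reduced to one value per id. This is a minimal sketch that assumes the df from the question; the name ids_outside is made up for illustration. Note that between() includes both endpoints, so values of exactly -1 or 0 count as inside the interval.

```r
library(dplyr)

df <- data.frame(
  id     = c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3),
  number = c(-10, -9, -8, -1, -0.5, 0.0, 0.23, 5, -2, -1.1, -.88, 1.2, 4, -10, 10)
)

# Keep only groups with no value in [-1, 0], then collapse to one entry per id.
# ids_outside is a hypothetical name used only for this sketch.
ids_outside <- df %>%
  group_by(id) %>%
  filter(!any(between(number, -1, 0))) %>%
  ungroup() %>%
  distinct(id) %>%
  pull(id)

ids_outside
#> [1] 3
```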

CodePudding user response:

df %>% group_by(id) %>% filter(!any(between(number, -1, 0)))
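
This gives the same result as all(!between(...)) in the other answer, since "no value is inside the interval" and "every value is outside the interval" are the same condition. If you would rather avoid dplyr, a rough base R sketch of the same idea might look like the following. It assumes the df from the question, and hit_ids and result are names invented for this example.

```r
df <- data.frame(
  id     = c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3),
  number = c(-10, -9, -8, -1, -0.5, 0.0, 0.23, 5, -2, -1.1, -.88, 1.2, 4, -10, 10)
)

# ids with at least one value in [-1, 0]; both bounds are inclusive,
# matching the behaviour of dplyr::between()
hit_ids <- unique(df$id[df$number >= -1 & df$number <= 0])

# keep only the rows whose id never falls in the interval
result <- df[!df$id %in% hit_ids, ]
result
```

With the example data, hit_ids contains 1 and 2, so only the two rows for id 3 remain.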