Home > Back-end >  Filter according to two conditions that are in dynamic dependence
Filter according to two conditions that are in dynamic dependence

Time:06-04

I'm having a large dataset with a value called CPE and a percentage PERC. In real world it's slightly different, but I reduced it to the minimum necessary.

| CPE|      PERC|
|---:|---------:|
|  42| 0.1132664|
| 264| 0.9260718|
| 470| 0.3732287|
| 316| 0.7437126|
|   9| 0.5819554|
| 114| 0.2052649|

Now I want to filter. Smaller amounts of CPE require a higher PERC to get in focus.

Surely I could do it this way:

df[(CPE > 20 & PERC > 0.95) |
     (CPE > 50 & PERC > 0.9) |
     (CPE > 100 & PERC > 0.85) |
     (CPE > 250 & PERC > 0.8)]

But as there can be many more in real world and additionally they can vary from time to time, I'm looking for an easier and more dynamic solution.
Something like creating a list with all combinations and then using each combination as a filter-condition-pair.

list(c(20, 0.95), c(50, 0.9), c(100,0.85), c(250,0.8))

Is there a smart way to handle this? Even if I'm preferring data.table, dplyr is great as well. And it haven't be a list, I'm open for any similar approach.


MWE

library(data.table)

set.seed(33)

df = data.table(CPE=sample(1:500, 100), 
                PERC=runif(min = 0.1, max = 1, n=100))

CodePudding user response:

If the list is already created, we can loop over the list, create the logical condition with the first and second element using CPE and PERC columns respectively, Reduce to a single logical vector with | and subset the data

df[Reduce(`|`, lapply(lst1, \(x) CPE > x[1] & PERC > x[2]))]

data

lst1 <- list(c(20, 0.95), c(50, 0.9), c(100,0.85), c(250,0.8))

CodePudding user response:

library(dplyr, warn.conflicts = FALSE)
#> Warning: package 'dplyr' was built under R version 4.1.2
library(purrr)

df <- structure(list(CPE = c(42L, 264L, 470L, 316L, 9L, 114L), PERC = c(0.1132664, 
0.9260718, 0.3732287, 0.7437126, 0.5819554, 0.2052649)), row.names = c(NA, 
-6L), class = "data.frame")

constraints <- structure(list(CPE = c(20, 50, 100, 250), PERC = c(0.95, 0.9, 
0.85, 0.8)), class = "data.frame", row.names = c(NA, -4L))

constraints
#>   CPE PERC
#> 1  20 0.95
#> 2  50 0.90
#> 3 100 0.85
#> 4 250 0.80

filt <- 
  map2(df, constraints, outer, '>') %>% 
    reduce(`&`) %>% 
    as.data.frame %>% 
    reduce(`|`)

df %>% 
  filter(filt)
#>   CPE      PERC
#> 1 264 0.9260718

Created on 2022-06-03 by the reprex package (v2.0.1)

  • Related