I'm having a large dataset with a value called CPE
and a percentage PERC
. In real world it's slightly different, but I reduced it to the minimum necessary.
| CPE| PERC|
|---:|---------:|
| 42| 0.1132664|
| 264| 0.9260718|
| 470| 0.3732287|
| 316| 0.7437126|
| 9| 0.5819554|
| 114| 0.2052649|
Now I want to filter. Smaller amounts of CPE
require a higher PERC
to get in focus.
Surely I could do it this way:
df[(CPE > 20 & PERC > 0.95) |
(CPE > 50 & PERC > 0.9) |
(CPE > 100 & PERC > 0.85) |
(CPE > 250 & PERC > 0.8)]
But as there can be many more in real world and additionally they can vary from time to time, I'm looking for an easier and more dynamic solution.
Something like creating a list with all combinations and then using each combination as a filter-condition-pair.
list(c(20, 0.95), c(50, 0.9), c(100,0.85), c(250,0.8))
Is there a smart way to handle this? Even if I'm preferring data.table
, dplyr
is great as well. And it haven't be a list, I'm open for any similar approach.
MWE
library(data.table)
set.seed(33)
df = data.table(CPE=sample(1:500, 100),
PERC=runif(min = 0.1, max = 1, n=100))
CodePudding user response:
If the list
is already created, we can loop over the list
, create the logical condition with the first and second element using CPE
and PERC
columns respectively, Reduce
to a single logical vector with |
and subset the data
df[Reduce(`|`, lapply(lst1, \(x) CPE > x[1] & PERC > x[2]))]
data
lst1 <- list(c(20, 0.95), c(50, 0.9), c(100,0.85), c(250,0.8))
CodePudding user response:
library(dplyr, warn.conflicts = FALSE)
#> Warning: package 'dplyr' was built under R version 4.1.2
library(purrr)
df <- structure(list(CPE = c(42L, 264L, 470L, 316L, 9L, 114L), PERC = c(0.1132664,
0.9260718, 0.3732287, 0.7437126, 0.5819554, 0.2052649)), row.names = c(NA,
-6L), class = "data.frame")
constraints <- structure(list(CPE = c(20, 50, 100, 250), PERC = c(0.95, 0.9,
0.85, 0.8)), class = "data.frame", row.names = c(NA, -4L))
constraints
#> CPE PERC
#> 1 20 0.95
#> 2 50 0.90
#> 3 100 0.85
#> 4 250 0.80
filt <-
map2(df, constraints, outer, '>') %>%
reduce(`&`) %>%
as.data.frame %>%
reduce(`|`)
df %>%
filter(filt)
#> CPE PERC
#> 1 264 0.9260718
Created on 2022-06-03 by the reprex package (v2.0.1)