I want to summarize my data in different ways; specifically, I want to count how many values are greater than a certain threshold.
I could easily do that with e.g.
library(tidyverse)
mtcars |>
  summarize(test1 = sum(mpg > 15, na.rm = TRUE))
However, how can I use summarize with several dynamic thresholds?
E.g. with an input like my_thresholds <- c(15, 20), I'd like to get the following output:
  test1 test2
1    26    14
I think one way could be to pass the thresholds as an argument to purrr::map and then bind_cols the two summaries afterwards. However, the summarize call itself is already wrapped in another purrr::map, i.e. my input is actually a list of data frames and I want to get the summaries for each list element:
input data:
input_data <- mtcars |>
  group_split(cyl)
And then my desired output would be one row per group.
One more note, the number of thresholds should also be dynamic, e.g. in one case I might have two thresholds, in another call I might have 5.
CodePudding user response:
What about something like this?
library(purrr)
input_data |>
  map(\(gp) map_int(my_thresholds, \(x) sum(gp$mpg > x, na.rm = TRUE)))
output
[[1]]
[1] 11 11
[[2]]
[1] 7 3
[[3]]
[1] 8 0
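If you need one row per group with test1, test2, ... columns as in the question, the same idea can probably be extended by naming the counts and binding them row-wise. This is only a sketch: count_above is a hypothetical helper name, and the testN column names are generated from the position of each threshold, so any number of thresholds should work.

```r
library(tidyverse)

my_thresholds <- c(15, 20)
input_data <- mtcars |>
  group_split(cyl)

# Hypothetical helper: count mpg values above each threshold and
# return them as a one-row tibble with columns test1, test2, ...
count_above <- function(df, thresholds) {
  counts <- map_int(thresholds, \(x) sum(df$mpg > x, na.rm = TRUE))
  names(counts) <- paste0("test", seq_along(thresholds))
  as_tibble(as.list(counts))
}

# One one-row tibble per list element, stacked into a single tibble
result <- input_data |>
  map(\(gp) count_above(gp, my_thresholds)) |>
  bind_rows()

result
# rows for cyl 4, 6, 8: test1 = 11, 7, 8 and test2 = 11, 3, 0
```

Note that group_split drops the group labels from the list, so the rows are identified only by their order (here cyl 4, 6, 8).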