I have a data frame named "data", as below:
id | quantity |
---|---|
01 | 5 |
02 | 3 |
03 | 7 |
04 | 4 |
05 | 9 |
I would like to set thresholds and count how many ids have a quantity equal to or below each threshold, giving a data frame "results" like this:
threshold | count |
---|---|
1 | 0 |
2 | 0 |
3 | 1 |
4 | 2 |
5 | 3 |
6 | 3 |
7 | 4 |
8 | 4 |
9 | 5 |
10 | 5 |
The only way I found to do this is with a "for" loop:
for (i in 1:10) {results$count[i] <- nrow(data[data$quantity <= i, ])}
This does work. However, my real data has 500 thresholds and I have to repeat almost the same process 12 times, so the "for" loop takes very long to run. I couldn't find anything to replace it. I would prefer something like:
results$count <- nrow(data[data$quantity <= results$threshold, ])
but it doesn't work ("longer object length is not a multiple of shorter object length"). Do you have any ideas?
CodePudding user response:
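Try this. `sum()` on a logical vector counts its `TRUE` values, so `sum(df$quantity <= x)` gives the count for a single threshold `x`, and `map_dfr()` row-binds one small tibble per threshold: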
library(tidyverse)
df <- tribble(
~id, ~quantity,
"01", 5,
"02", 3,
"03", 7,
"04", 4,
"05", 9
)
result <- map_dfr(1:10, function(x){
tibble(
threshold = x,
count = sum(df$quantity <= x)
)
})
result
#> # A tibble: 10 × 2
#> threshold count
#> <int> <int>
#> 1 1 0
#> 2 2 0
#> 3 3 1
#> 4 4 2
#> 5 5 3
#> 6 6 3
#> 7 7 4
#> 8 8 4
#> 9 9 5
#> 10 10 5
Created on 2022-07-06 by the reprex package (v2.0.1)
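If you would rather stay in base R, here is a minimal sketch of two alternatives. The one-liner from the question fails because `<=` compares the two vectors element by element (recycling the shorter one) and `nrow()` returns a single number rather than one count per threshold. The names `thresholds`, `count_vapply` and `count_sorted` are just illustrative, and the example assumes `quantity` is numeric. I have not timed either version on your real data.

```r
# Same toy data as in the question
data <- data.frame(
  id = c("01", "02", "03", "04", "05"),
  quantity = c(5, 3, 7, 4, 9)
)
thresholds <- 1:10

# Option 1: one count per threshold with vapply().
# sum() on a logical vector counts the TRUE values.
count_vapply <- vapply(
  thresholds,
  function(t) sum(data$quantity <= t),
  integer(1)
)

# Option 2: sort the quantities once, then let findInterval() report,
# for each threshold, how many sorted values are <= that threshold.
# Note that sort() drops NA values by default.
count_sorted <- findInterval(thresholds, sort(data$quantity))

results <- data.frame(threshold = thresholds, count = count_sorted)
results
```

Option 1 still does one pass over `quantity` per threshold, like the loop, but avoids subsetting the data frame each time. Option 2 sorts once and then looks up each threshold, which should matter more as the number of rows and thresholds grows.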