I want to use slice_sample() with a sample size that depends on the group size, but it seems we cannot use n() inside case_when() there. How can I make slice_sample() conditional on the group counts?
library(dplyr)
dat <- mtcars |> group_by(cyl) |> mutate(counts = n()) |>
slice_sample(n = case_when(n() > 10 ~ 10, TRUE ~ n()))
Error in `slice_sample()`:
! `n` must be a constant.
Caused by error in `n()`:
! Must be used inside dplyr verbs.
Run `rlang::last_error()` to see where the error occurred.
CodePudding user response:
You can use an if/else per group with slice(), like this:
library(dplyr)
mtcars |>
group_by(cyl) |>
mutate(counts = n()) |>
slice(if(n() > 10) sample(1:n(), 10) else 1:n())
Output:
# A tibble: 27 × 12
# Groups: cyl [3]
mpg cyl disp hp drat wt qsec vs am gear carb counts
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int>
1 21.4 4 121 109 4.11 2.78 18.6 1 1 4 2 11
2 32.4 4 78.7 66 4.08 2.2 19.5 1 1 4 1 11
3 30.4 4 95.1 113 3.77 1.51 16.9 1 1 5 2 11
4 27.3 4 79 66 4.08 1.94 18.9 1 1 4 1 11
5 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1 11
6 30.4 4 75.7 52 4.93 1.62 18.5 1 1 4 2 11
7 24.4 4 147. 62 3.69 3.19 20 1 0 4 2 11
8 33.9 4 71.1 65 4.22 1.84 19.9 1 1 4 1 11
9 22.8 4 141. 95 3.92 3.15 22.9 1 0 4 2 11
10 26 4 120. 91 4.43 2.14 16.7 0 1 5 2 11
# … with 17 more rows
Please note: the counts column is not needed to slice the data.
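To check that the if/else logic caps each group at 10 rows, you can count the rows per group after slicing. This is only a sketch: the name `sampled` and the seed are my own choices, and `sample(n(), 10)` / `seq_len(n())` are equivalent spellings of the `sample(1:n(), 10)` / `1:n()` used above.

```r
library(dplyr)

set.seed(1)  # arbitrary seed, only so the draw is reproducible

# `sampled` is a hypothetical name used for illustration
sampled <- mtcars |>
  group_by(cyl) |>
  # keep 10 random rows in groups larger than 10, otherwise keep every row
  slice(if (n() > 10) sample(n(), 10) else seq_len(n())) |>
  ungroup()

# rows kept per group: 10 for cyl 4, 7 for cyl 6, 10 for cyl 8
count(sampled, cyl)
```

The three groups in mtcars have 11, 7 and 14 rows, so the capped result has 10 + 7 + 10 = 27 rows, which matches the tibble shown above.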
CodePudding user response:
I don't think you need any if or case_when:
set.seed(42)
mtcars |>
group_by(cyl) |>
slice(head(sample(row_number()), 10))
# # A tibble: 27 x 11
# # Groups: cyl [3]
# mpg cyl disp hp drat wt qsec vs am gear carb
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
# 2 30.4 4 75.7 52 4.93 1.62 18.5 1 1 4 2
# 3 21.4 4 121 109 4.11 2.78 18.6 1 1 4 2
# 4 26 4 120. 91 4.43 2.14 16.7 0 1 5 2
# 5 24.4 4 147. 62 3.69 3.19 20 1 0 4 2
# 6 32.4 4 78.7 66 4.08 2.2 19.5 1 1 4 1
# 7 21.5 4 120. 97 3.7 2.46 20.0 1 0 3 1
# 8 30.4 4 95.1 113 3.77 1.51 16.9 1 1 5 2
# 9 27.3 4 79 66 4.08 1.94 18.9 1 1 4 1
# 10 33.9 4 71.1 65 4.22 1.84 19.9 1 1 4 1
# # ... with 17 more rows
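It may also be worth trying slice_sample() with a constant n. As far as I can tell from the dplyr documentation, when sampling without replacement (the default) an n larger than the group is silently truncated to the group size, so no condition on n() should be needed. A sketch under that assumption, with `capped` as a made-up name:

```r
library(dplyr)

set.seed(42)

# `capped` is a hypothetical name used for illustration
capped <- mtcars |>
  group_by(cyl) |>
  # constant n; groups with fewer than 10 rows are returned whole
  slice_sample(n = 10)

# rows kept per group
count(capped, cyl)
```

If this holds on your dplyr version, the cyl == 6 group (7 rows) comes back complete and the other two groups are cut to 10 rows each.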