using n() in case_when in dplyr

Time:06-22

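I want to use slice_sample() with a sample size that depends on the group size: 10 rows when a group has more than 10 rows, and all rows otherwise. But it seems that n() cannot be used inside case_when() here. How can I make slice_sample() conditional on the group counts?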

library(dplyr)
dat <- mtcars |> group_by(cyl) |> mutate(counts = n()) |>
  slice_sample(n = case_when(n() > 10 ~ 10, TRUE ~ n()))

Error in `slice_sample()`:
! `n` must be a constant.
Caused by error in `n()`:
! Must be used inside dplyr verbs.
Run `rlang::last_error()` to see where the error occurred.

CodePudding user response:

You can use an if/else per group inside slice(), where n() is available:

library(dplyr)
mtcars |> 
  group_by(cyl) |> 
  mutate(counts = n()) |>
  slice(if(n() > 10) sample(1:n(), 10) else 1:n())

Output:

# A tibble: 27 × 12
# Groups:   cyl [3]
     mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb counts
   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <int>
 1  21.4     4 121     109  4.11  2.78  18.6     1     1     4     2     11
 2  32.4     4  78.7    66  4.08  2.2   19.5     1     1     4     1     11
 3  30.4     4  95.1   113  3.77  1.51  16.9     1     1     5     2     11
 4  27.3     4  79      66  4.08  1.94  18.9     1     1     4     1     11
 5  22.8     4 108      93  3.85  2.32  18.6     1     1     4     1     11
 6  30.4     4  75.7    52  4.93  1.62  18.5     1     1     4     2     11
 7  24.4     4 147.     62  3.69  3.19  20       1     0     4     2     11
 8  33.9     4  71.1    65  4.22  1.84  19.9     1     1     4     1     11
 9  22.8     4 141.     95  3.92  3.15  22.9     1     0     4     2     11
10  26       4 120.     91  4.43  2.14  16.7     0     1     5     2     11
# … with 17 more rows

Please note: The counts column is not necessary to slice the data.
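
A more compact variant of the same idea, as a minimal sketch: since n() works inside slice(), as the answer above shows, the sample size can be capped with min() instead of branching. The name capped and the seed are only for illustration.

```r
library(dplyr)

set.seed(1)  # arbitrary seed, only to make the draw reproducible

capped <- mtcars |>
  group_by(cyl) |>
  # draw min(10, group size) distinct row positions within each group
  slice(sample.int(n(), min(10L, n())))
```

Groups with 10 or fewer rows are kept whole (in shuffled order), and larger groups are cut down to 10 rows.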

CodePudding user response:

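I don't think you need any if or case_when. head() returns at most 10 elements, so shuffling the row numbers and taking the head caps each group at 10: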

set.seed(42)
mtcars |>
  group_by(cyl) |>
  slice(head(sample(row_number()), 10))
# # A tibble: 27 x 11
# # Groups:   cyl [3]
#      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#  1  22.8     4 108      93  3.85  2.32  18.6     1     1     4     1
#  2  30.4     4  75.7    52  4.93  1.62  18.5     1     1     4     2
#  3  21.4     4 121     109  4.11  2.78  18.6     1     1     4     2
#  4  26       4 120.     91  4.43  2.14  16.7     0     1     5     2
#  5  24.4     4 147.     62  3.69  3.19  20       1     0     4     2
#  6  32.4     4  78.7    66  4.08  2.2   19.5     1     1     4     1
#  7  21.5     4 120.     97  3.7   2.46  20.0     1     0     3     1
#  8  30.4     4  95.1   113  3.77  1.51  16.9     1     1     5     2
#  9  27.3     4  79      66  4.08  1.94  18.9     1     1     4     1
# 10  33.9     4  71.1    65  4.22  1.84  19.9     1     1     4     1
# # ... with 17 more rows
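
It may also be worth checking whether the conditional is needed at all. As far as the dplyr documentation describes it, slice_sample() without replacement silently truncates n to the group size when a group has fewer rows, so a constant n = 10 should give the same result. A sketch under that assumption (sampled is an illustrative name):

```r
library(dplyr)

set.seed(42)  # arbitrary seed, only to make the draw reproducible

sampled <- mtcars |>
  group_by(cyl) |>
  # constant n; groups smaller than 10 are returned in full
  slice_sample(n = 10)

# rows kept per group
count(ungroup(sampled), cyl)
```

With mtcars, whose cyl groups have 11, 7 and 14 rows, this keeps 10, 7 and 10 rows, which is the same 27 rows in total as in the outputs above.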