Sampling within group with varied numbers in R

Time:06-28

Suppose I have a data frame df:

set.seed(123)
n1  <- 5
n2  <- 8
DVm <- rnorm(n1, 180, 10)
DVf <- rnorm(n2, 175, 6)
df <- data.frame(DV=c(DVm, DVf),
                   IV=factor(rep(c("m", "f"), c(n1, n2))))
df
         DV IV
1  174.3952  m
2  177.6982  m
3  195.5871  m
4  180.7051  m
5  181.2929  m
6  185.2904  f
7  177.7655  f
8  167.4096  f
9  170.8789  f
10 172.3260  f
11 182.3445  f
12 177.1589  f
13 177.4046  f

What I want is to create a new data frame by sampling n1 new DV values with replacement where IV == "m" and n2 new DV values with replacement where IV == "f", so that the new data frame has the same dimensions as df and is resampled within each of the groups m and f. Is there a single function for this?

Answer:

We can use slice_sample() within group_modify(), which applies the sampling to each IV group separately:

library(dplyr)

df %>%
  group_by(IV) %>%
  # within each group, draw as many rows as the group has, with replacement
  group_modify(~ slice_sample(.x, n = nrow(.x), replace = TRUE)) %>%
  ungroup()

Output:

# A tibble: 13 × 2
   IV       DV
   <fct> <dbl>
 1 f      177.
 2 f      182.
 3 f      185.
 4 f      178.
 5 f      177.
 6 f      171.
 7 f      172.
 8 f      167.
 9 m      181.
10 m      178.
11 m      174.
12 m      196.
13 m      181.
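
For comparison, the same within-group resampling can be sketched in base R without extra packages. This is only an illustration and not part of the original answer: the helper name resample_idx and the result name boot are made up here. The idea is to let ave() draw, for every row, the index of a row from the same IV group.

```r
# Rebuild the example data from the question.
set.seed(123)
n1 <- 5
n2 <- 8
df <- data.frame(DV = c(rnorm(n1, 180, 10), rnorm(n2, 175, 6)),
                 IV = factor(rep(c("m", "f"), c(n1, n2))))

# Hypothetical helper: resample a vector of row indices with replacement.
# Indexing via sample.int() avoids the sample() surprise for length-1 input.
resample_idx <- function(i) i[sample.int(length(i), replace = TRUE)]

# ave() applies the helper within each level of IV and keeps the row order,
# so row k of the result comes from the same group as row k of df.
idx  <- ave(seq_len(nrow(df)), df$IV, FUN = resample_idx)
boot <- df[idx, ]
rownames(boot) <- NULL
boot
```

Here boot has the same dimensions and the same IV column as df; only the DV values are redrawn within each group. Unlike the group_modify() result, the rows stay in their original order rather than being sorted by group. With dplyr, calling slice_sample(prop = 1, replace = TRUE) directly on the grouped data frame should give an equivalent draw without group_modify(), since prop is applied per group.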