This is a general question about how slice_sample works. Starting from my original database, I am doing something like this:
df <- dat_longer %>%
  dplyr::select(grupo_int_v00, time, peso1, cintura1, hdl) %>%
  group_by(grupo_int_v00) %>%
  slice_sample(n = 20, replace = TRUE) %>%
  ungroup() %>%
  dput()
This gives me the following dput output:
df<-structure(list(grupo_int_v00 = structure(c(1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L), .Label = c("A", "B"), label = "Grupo de intervención", class = "factor"),
time = c(0, 0, 2, 0, 2, 1, 1, 2, 2, 1, 1, 0, 2, 1, 2, 0,
1, 2, 1, 0, 0, 2, 2, 1, 0, 2, 2, 1, 0, 2, 1, 0, 1, 0, 1,
2, 1, 0, 0, 0), peso1 = c(100.7, 93, 84.5, 110.2, 76.4, 90.7,
93.6, 90.2, 84.8, 82.1, 125.3, 80.2, 76, 64.5, 86.9, 99,
83.9, 96.1, 91.6, 89.9, 93.4, 98.8, 70, 67.7, 110.3, 75,
87.2, 97.9, 82.7, 69.5, 81.2, 98, 73.8, 91.2, 87, 95, 76.6,
103.2, 103.4, 60), cintura1 = c(116.5, 112, 107, 127, NA,
106, 98.5, 124, 103.5, 107, 133.5, 104.5, 104.5, 97, 104.5,
107, 116, 110, 109, 113, 107, 105, 98, 101, 132, NA, 96.5,
118, 110, 85, 106.5, 123, 108, 107.5, 112, 117, 97.5, 114,
119, 94), hdl = c(56, 47, 61, 54, NA, 80, 61, 76, 50, 71,
64, 47, 59, 61, 59, 49, 49, 68, 71, 59, 55, 43, 52, 53, 42,
NA, 40, 40, 58, 60, 53, 62, 56, 48, 58, 39, 54, 63, 45, 45
)), row.names = c(NA, -40L), class = c("tbl_df", "tbl", "data.frame"
))
The result has 40 rows, even though I specified n = 20. I have gone through the function's arguments, but I don't really understand what is going on.
Thanks in advance
CodePudding user response:
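This is because you use group_by, which makes slice_sample draw n rows from each group rather than n rows overall. Your data has two groups (A and B), so n = 20 returns 2 × 20 = 40 rows. Here is an example using the iris dataset: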
iris %>%
group_by(Species) %>%
slice_sample(n = 5)
Output:
# A tibble: 15 × 5
# Groups: Species [3]
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
<dbl> <dbl> <dbl> <dbl> <fct>
1 4.8 3.4 1.9 0.2 setosa
2 5 3.3 1.4 0.2 setosa
3 5.2 3.5 1.5 0.2 setosa
4 4.5 2.3 1.3 0.3 setosa
5 5.1 3.8 1.5 0.3 setosa
6 5.6 3 4.5 1.5 versicolor
7 6.5 2.8 4.6 1.5 versicolor
8 5.8 2.6 4 1.2 versicolor
9 5.5 2.4 3.7 1 versicolor
10 6.4 3.2 4.5 1.5 versicolor
11 6.7 3.3 5.7 2.1 virginica
12 6.7 3 5.2 2.3 virginica
13 5.7 2.5 5 2 virginica
14 5.8 2.8 5.1 2.4 virginica
15 7.2 3.2 6 1.8 virginica
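One way to confirm the per-group behaviour is to count the rows in each group after sampling. The following is a minimal sketch, assuming dplyr is installed; the name sampled and the seed value are arbitrary and only there for illustration.

```r
library(dplyr)

set.seed(1)  # arbitrary seed, only so the sampled rows are reproducible

sampled <- iris %>%
  group_by(Species) %>%
  slice_sample(n = 5) %>%
  ungroup()

# One row per species, each with n = 5
count(sampled, Species)

# Total rows: 3 groups x 5 rows per group
nrow(sampled)
```

The total is always the number of groups times n, as long as each group has at least n rows or replace = TRUE is set.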
Without group_by:
iris %>%
slice_sample(n = 5)
Output:
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.2 3.4 1.4 0.2 setosa
2 6.6 2.9 4.6 1.3 versicolor
3 7.2 3.6 6.1 2.5 virginica
4 5.5 3.5 1.3 0.2 setosa
5 4.7 3.2 1.6 0.2 setosa
Without grouping, n applies to the whole data frame, so it returns 5 rows in total.
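If the goal is a fixed total number of rows, there are two straightforward options: drop the grouping, or divide the total among the groups. Here is a rough sketch using iris; n_total, per_group, overall and balanced are illustrative names, and the target of 30 is an assumption chosen so that it divides evenly across the three species.

```r
library(dplyr)

n_total <- 30  # illustrative target for the total number of rows

# Option 1: no grouping, so n is the size of the whole sample
overall <- iris %>%
  slice_sample(n = n_total, replace = TRUE)

# Option 2: keep the groups balanced by dividing the total among them
per_group <- n_total %/% n_distinct(iris$Species)  # 30 %/% 3 = 10
balanced <- iris %>%
  group_by(Species) %>%
  slice_sample(n = per_group, replace = TRUE) %>%
  ungroup()

nrow(overall)   # 30
nrow(balanced)  # 30
```

Option 1 does not guarantee equal representation of each group, while Option 2 does. For the data in the question, with two groups and a target of 20 rows, Option 2 would mean n = 10.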