I have a data frame (df) in which every 3 consecutive rows form a biological triplicate.
First, for every block of 3 rows, I'd like to randomly select 1 row, remove it from df, and put it in df_test.
CodePudding user response:
library(dplyr)
df_test <- df %>%
  group_by(grp = (row_number() - 1) %/% 3) %>%
  slice_sample(n = 1) %>%
  ungroup()
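The pipeline above builds df_test but leaves df unchanged. One possible way to also take the sampled rows out of df is to tag each row with an index before sampling and then anti-join on it. This is only a sketch: the toy df and the names row_id, df_indexed and df_rest are made up for illustration and are not from the original post.

```r
library(dplyr)

# Toy data: 9 rows = 3 triplicates (hypothetical stand-in for the real df)
df <- data.frame(sample_id = 1:9, value = seq(10, 90, by = 10))

# Tag every row with its position and its triplicate number
df_indexed <- df %>%
  mutate(row_id = row_number(),
         grp = (row_number() - 1) %/% 3)

# Pick one row at random from each triplicate
df_test <- df_indexed %>%
  group_by(grp) %>%
  slice_sample(n = 1) %>%
  ungroup()

# Keep only the rows that were not picked
df_rest <- anti_join(df_indexed, df_test, by = "row_id")
```

With 3 triplicates, df_test should end up with 3 rows and df_rest with the other 6; drop the helper columns with select(-row_id, -grp) if they are not needed.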
CodePudding user response:
You should be able to sample all at once. If each group is a block of n rows, randomly sample an offset from 0:(n-1) for each block and add it to the start of each block, seq(1, nrow(df), n).
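s <- seq(1, nrow(df), n)
# replace = TRUE gives each block its own independent offset
# (and avoids an error when there are more blocks than n)
df[sample(0:(n-1), length(s), replace = TRUE) + s, ]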
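To also remove the selected rows from df, as the question asks, the same indices can be reused with negative indexing. A minimal sketch, assuming n = 3 and a toy df; the names picked and df_rest are illustrative and not part of the original answer.

```r
# Toy data: 12 rows = 4 blocks of n = 3 (hypothetical stand-in for the real df)
n <- 3
df <- data.frame(id = 1:12, value = (1:12) * 10)

s <- seq(1, nrow(df), by = n)   # first row index of each block
# One independent random offset per block, added to the block start
picked <- s + sample(0:(n - 1), length(s), replace = TRUE)

df_test <- df[picked, ]    # one randomly chosen row per block
df_rest <- df[-picked, ]   # the two remaining rows of each block
```

Because picked always holds exactly one index per block, df_rest keeps n - 1 rows from every block.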
Try it with 1000 runs and the distribution of rows selected seems pretty uniform:
set.seed(1)
n <- 3
df <- data.frame(matrix(1:18, ncol=2))
s <- seq(1, nrow(df), n)
table(replicate(1000, sample(0:(n-1), length(s), replace = TRUE) + s))
# 1 2 3 4 5 6 7 8 9
#341 329 330 325 344 331 334 327 339