How to randomise order of group within group in R/dplyr?

Time:04-27

I have a group nested within another group in my data. I would like to randomise the order of the nested groups while preserving the order of the rows within each nested group. (This will be a step within an existing pipe, so a tidyverse solution would be ideal.)

In the example below, how do I randomise the order of block within participant_id, while also preserving the order of both participant_id and trial?

library(dplyr)
set.seed(123)

# dummy data
data <- tibble::tribble(
          ~participant_id, ~block, ~trial,
                       1L,    "a",     1L,
                       1L,    "a",     2L,
                       1L,    "a",     3L,
                       1L,    "b",     1L,
                       1L,    "b",     2L,
                       1L,    "b",     3L,
                       2L,    "a",     1L,
                       2L,    "a",     2L,
                       2L,    "a",     3L,
                       2L,    "b",     1L,
                       2L,    "b",     2L,
                       2L,    "b",     3L
          )


# something along the lines of...

new_data <- data %>% 
  group_by(participant_id) %>%
  # ? step here to randomise the order of 'block' within each participant, while preserving the order of 'trial' within each block

Thanks.

CodePudding user response:

Here's one approach, which shuffles the blocks one participant at a time with group_map():

# Randomise within one participant
randomiseGroup <- function(.x, .y) {
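  # Generalise so that any number of blocks can be handled:
  # draw one random number per distinct block
  r <- .x %>% 
    distinct(block) %>% 
    mutate(random=runif(nrow(.)))
  # Randomise: attach the random number to every row, sort blocks by it
  # (keeping trial order within each block), then drop the helper column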
  .y %>% 
    bind_cols(
      .x %>% 
        ungroup() %>% 
        left_join(r, by="block") %>% 
        arrange(random, trial) %>% 
        select(-random)
    )
}

# Randomise all participants
data %>% 
  group_by(participant_id) %>% 
  group_map(randomiseGroup) %>% 
  bind_rows()
# A tibble: 12 × 3
   participant_id block trial
            <int> <chr> <int>
 1              1 a         1
 2              1 a         2
 3              1 a         3
 4              1 b         1
 5              1 b         2
 6              1 b         3
 7              2 b         1
 8              2 b         2
 9              2 b         3
10              2 a         1
11              2 a         2
12              2 a         3
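
If you would rather stay in a single pipe without group_map(), the same idea can be written as a join. This is only a sketch, assuming each block appears once per participant; block_order and block_rank are names made up for this illustration, and the data is the question's example rebuilt with rep():

```r
library(dplyr)
set.seed(123)

# Same shape as the question's dummy data
data <- tibble::tibble(
  participant_id = rep(1:2, each = 6),
  block = rep(rep(c("a", "b"), each = 3), times = 2),
  trial = rep(1:3, times = 4)
)

# One row per participant/block, with a random rank drawn within each participant
block_order <- data %>%
  distinct(participant_id, block) %>%
  group_by(participant_id) %>%
  mutate(block_rank = sample.int(n())) %>%
  ungroup()

# Attach the rank, sort blocks by it (trial order kept inside each block),
# then drop the helper column
new_data <- data %>%
  left_join(block_order, by = c("participant_id", "block")) %>%
  arrange(participant_id, block_rank, trial) %>%
  select(-block_rank)
```

Because the rank is drawn once per block rather than once per row, all rows of a block move together, and arrange() puts trial back in ascending order inside each block.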

CodePudding user response:

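One option could be to compute a random sort key per block (block_random), which is constant within each run of a block and distinct between blocks of the same participant: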

data %>%
    group_by(participant_id) %>%
    mutate(rleid = cumsum(block != lag(block, default = first(block))),
           block_random = sample(n())) %>%
    group_by(participant_id, rleid) %>%
    mutate(block_random = min(block_random)) %>%
    ungroup()

   participant_id block trial rleid block_random
            <int> <chr> <int> <int>        <int>
 1              1 a         1     0            2
 2              1 a         2     0            2
 3              1 a         3     0            2
 4              1 b         1     1            1
 5              1 b         2     1            1
 6              1 b         3     1            1
 7              2 a         1     0            2
 8              2 a         2     0            2
 9              2 a         3     0            2
10              2 b         1     1            1
11              2 b         2     1            1
12              2 b         3     1            1
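
The output above still has the rows in their original order, since the pipe only adds the helper columns. A possible way to finish it, assuming the goal is to sort on block_random and then drop the helpers (the arrange() and select() steps are my addition, not part of the answer above):

```r
library(dplyr)
set.seed(123)

# Same shape as the question's dummy data
data <- tibble::tibble(
  participant_id = rep(1:2, each = 6),
  block = rep(rep(c("a", "b"), each = 3), times = 2),
  trial = rep(1:3, times = 4)
)

new_data <- data %>%
  group_by(participant_id) %>%
  # rleid numbers each consecutive run of the same block;
  # block_random gives every row a distinct random integer
  mutate(rleid = cumsum(block != lag(block, default = first(block))),
         block_random = sample(n())) %>%
  group_by(participant_id, rleid) %>%
  # collapse to one key per run, so the rows of a block move together
  mutate(block_random = min(block_random)) %>%
  ungroup() %>%
  # added step: order blocks by the random key, keep trial order within a block
  arrange(participant_id, block_random, trial) %>%
  select(-rleid, -block_random)
```

Since sample(n()) produces distinct integers within a participant, the minimum per run is also distinct, so two blocks never tie on block_random.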