Randomly sampling data by several groups with conditions?

Time:09-14

I have a data frame with Likert responses to 7 question types about 10 voices. Here's a sample of my data:

> dput(cor1)
structure(list(ID = structure(c(49L, 78L, 81L, 72L, 60L, 95L, 
35L, 16L, 44L, 89L, 96L, 24L, 48L, 91L, 94L, 36L, 57L, 75L, 17L, 
59L, 73L, 24L, 55L, 64L, 20L, 41L, 27L, 75L, 32L, 28L, 73L, 82L, 
61L, 52L, 89L, 7L, 84L, 81L, 39L, 98L, 81L, 36L, 26L, 92L, 50L, 
77L, 25L, 54L, 59L, 38L, 16L, 63L, 77L, 27L, 99L, 39L, 4L, 3L, 
86L, 40L, 52L, 13L, 1L, 12L, 96L, 3L, 76L, 31L, 100L, 35L, 18L, 
86L, 42L, 53L, 29L, 17L, 37L, 26L, 27L, 97L, 88L, 74L, 11L, 91L, 
69L, 45L, 8L, 55L, 37L, 60L, 58L, 68L, 90L, 78L, 81L, 36L, 47L, 
90L, 42L, 16L, 71L, 66L, 39L, 45L, 99L, 25L, 64L, 42L, 1L, 58L, 
50L, 82L, 44L, 78L, 25L, 98L, 13L, 58L, 69L, 96L, 44L, 83L, 46L, 
99L, 67L, 90L, 72L, 45L, 69L, 49L, 100L, 65L, 34L, 74L, 28L, 
94L, 80L, 45L, 30L, 51L, 54L, 30L, 85L, 94L, 4L, 5L, 9L, 21L, 
28L, 66L, 57L, 91L, 10L, 72L, 10L, 7L, 98L, 1L, 38L, 89L, 59L, 
72L, 65L, 61L, 86L, 19L, 9L, 76L, 50L, 53L, 19L, 43L, 46L, 74L, 
94L, 97L, 39L, 86L, 93L, 25L, 95L, 77L, 28L, 75L, 77L, 67L, 7L, 
50L, 36L, 95L, 97L, 15L, 42L, 10L, 38L, 33L, 90L, 32L, 8L, 63L, 
92L, 16L, 18L, 18L, 50L, 51L, 71L, 30L, 29L, 84L, 3L, 20L, 67L, 
89L, 65L, 3L, 14L, 59L, 36L, 47L, 80L, 61L, 90L, 19L, 87L, 38L, 
100L, 82L, 43L, 27L, 9L, 20L, 62L, 99L, 54L, 78L, 62L, 1L, 81L, 
44L, 93L, 89L, 85L, 82L, 34L, 95L, 36L, 6L, 83L, 81L, 12L, 50L, 
63L, 11L, 42L, 10L, 81L, 52L, 22L, 93L, 6L, 29L, 34L, 2L, 48L, 
2L, 10L, 16L, 36L, 48L, 9L, 80L, 29L, 90L, 99L, 22L, 61L, 61L, 
63L, 43L, 93L, 36L, 5L, 19L, 37L, 51L, 42L, 97L, 38L, 12L, 48L, 
58L, 84L, 38L, 75L, 20L, 98L, 61L, 2L, 38L, 89L, 31L, 82L, 15L, 
93L, 32L, 66L, 90L, 60L, 15L, 45L, 27L, 85L, 52L, 18L, 64L, 74L, 
30L, 49L, 3L, 48L, 6L, 51L, 9L, 7L, 80L, 2L, 80L, 92L, 48L, 11L, 
47L, 70L, 99L, 36L, 6L, 15L, 53L, 8L, 39L, 2L, 56L, 8L, 30L, 
58L, 99L, 50L, 45L, 10L, 75L, 68L, 87L, 21L, 16L, 18L, 96L, 18L, 
77L, 64L, 15L, 11L, 97L, 85L, 66L, 91L, 13L, 67L, 53L, 85L, 50L, 
6L, 89L, 35L, 32L, 65L, 38L, 16L, 61L, 88L, 35L, 20L, 99L, 87L, 
92L, 49L, 80L, 42L, 54L, 99L, 9L, 56L, 57L, 83L, 77L, 7L, 25L, 
46L, 11L, 39L, 37L, 80L, 6L, 59L, 2L, 83L, 88L, 84L, 23L, 25L, 
18L, 41L, 47L, 39L, 86L, 77L, 100L, 90L, 84L, 44L, 48L, 42L, 
84L, 92L, 31L, 20L, 96L, 46L, 36L, 41L, 82L, 24L, 98L, 63L, 96L, 
54L, 27L, 42L, 29L, 50L, 64L, 36L, 89L, 13L, 80L, 32L, 45L, 16L, 
94L, 25L, 72L, 43L, 85L, 73L, 76L, 8L, 77L, 74L, 60L, 80L, 34L, 
38L, 7L, 43L, 42L, 66L, 95L, 86L, 100L, 4L, 99L, 16L, 90L, 69L, 
6L, 56L, 17L, 33L, 46L, 73L, 6L, 39L, 98L, 50L, 4L, 63L, 98L, 
98L, 30L, 12L, 24L, 86L, 62L, 6L, 38L, 13L, 25L, 11L, 62L, 95L, 
3L, 87L, 63L, 70L, 46L, 67L, 27L, 20L, 38L, 55L, 26L, 16L, 4L, 
87L, 23L, 17L, 8L, 21L, 34L, 98L, 56L, 21L, 33L, 82L, 32L, 40L, 
18L, 56L, 16L, 98L, 9L, 55L, 25L, 99L, 14L, 99L, 15L, 58L, 18L, 
52L, 30L, 21L, 37L, 49L, 52L, 18L, 83L, 91L, 12L, 89L, 13L, 63L, 
15L, 62L, 54L, 45L, 56L, 93L, 65L, 15L, 94L, 56L, 42L, 44L, 55L, 
100L, 7L, 42L, 17L, 3L, 34L, 16L, 29L, 69L, 88L, 31L, 64L, 55L, 
96L, 27L, 51L, 70L, 47L, 83L, 50L, 87L, 86L, 39L, 46L, 44L, 47L, 
52L, 78L, 26L, 53L, 24L, 43L, 24L, 38L, 94L, 96L, 55L, 54L, 91L, 
6L, 16L, 44L, 48L, 80L, 27L, 33L, 25L, 66L, 45L, 99L, 53L, 12L, 
49L, 46L, 33L, 56L, 88L, 29L, 24L, 88L, 62L, 82L, 61L, 100L, 
77L, 89L, 65L, 18L, 24L, 96L, 41L, 70L, 20L, 75L, 81L, 50L, 47L, 
93L, 39L, 16L, 4L, 65L, 32L, 57L, 10L, 12L, 69L, 43L, 81L, 97L, 
74L, 9L, 87L, 33L, 60L, 73L, 14L, 17L, 56L, 27L, 41L, 35L, 64L, 
19L, 92L, 42L, 5L, 58L, 19L, 17L, 31L, 50L, 78L, 10L, 62L, 56L, 
62L, 54L, 15L, 25L, 66L, 4L, 98L, 89L, 26L, 79L, 96L, 9L, 60L, 
11L, 71L, 8L, 79L, 39L, 94L, 6L, 13L, 90L, 38L, 38L, 54L, 84L, 
54L, 92L, 68L, 41L, 7L, 61L, 71L, 98L, 22L, 47L, 24L, 25L, 25L, 
69L, 19L, 99L, 24L, 19L, 51L, 91L, 56L, 82L, 35L, 93L, 15L, 55L, 
87L, 19L, 84L, 22L, 100L, 45L, 43L, 90L, 40L, 64L, 35L, 34L, 
85L, 97L, 21L, 16L, 7L, 49L, 55L, 26L, 62L, 80L, 69L, 52L, 74L, 
8L, 52L, 12L, 18L, 80L, 63L, 79L, 89L, 35L, 9L, 22L, 61L, 37L, 
73L, 91L, 68L, 68L, 80L, 20L, 70L, 27L, 76L, 13L, 5L, 74L, 69L, 
29L, 68L, 91L, 16L, 3L, 19L, 20L, 31L, 60L, 77L, 53L, 48L, 5L, 
15L, 47L, 58L, 2L, 51L, 27L, 23L, 50L, 97L, 42L, 53L, 47L, 13L, 
100L, 12L, 11L, 2L, 86L, 58L, 9L, 23L, 28L, 65L, 77L, 8L, 45L, 
38L, 6L, 64L, 89L, 78L, 18L, 88L, 73L, 1L, 94L, 27L, 97L, 33L, 
75L, 5L, 56L, 48L, 61L, 39L, 76L, 100L, 45L, 87L, 54L, 76L, 8L, 
59L, 64L, 96L, 45L, 74L, 39L, 33L, 33L, 17L, 28L, 57L, 24L, 75L, 
38L, 80L, 94L, 3L, 34L, 7L, 90L, 67L, 45L, 97L, 30L, 55L, 88L, 
15L, 2L, 32L, 86L, 62L, 76L, 24L, 68L, 37L, 3L, 37L, 92L, 12L, 
75L, 10L, 73L, 44L, 32L, 56L, 99L, 21L, 6L, 76L, 42L, 81L, 40L, 
35L, 11L, 11L, 76L, 34L, 38L, 96L, 66L, 74L, 19L, 79L, 28L, 13L, 
73L, 15L, 48L, 64L, 58L, 90L, 32L, 24L, 95L, 15L, 90L, 35L, 100L, 
61L, 8L, 4L, 9L, 91L, 45L, 78L, 84L, 1L, 54L, 93L, 76L, 20L, 
11L, 74L, 81L, 27L, 49L, 48L, 18L, 21L, 48L, 51L, 31L, 65L, 18L, 
84L, 99L, 76L, 79L, 1L, 71L, 37L, 99L, 23L, 35L, 3L, 41L, 83L, 
89L, 4L, 51L, 88L, 85L, 63L, 59L, 8L, 50L, 2L, 69L, 3L, 64L, 
24L, 68L, 93L, 83L, 22L, 82L, 39L, 98L, 12L, 93L, 30L, 59L, 17L, 
88L, 41L, 28L, 55L, 39L, 54L, 5L, 71L, 24L), levels = c("3076960", 
"3077063", "3077134", "3077775", "3079376", "3117052", "3117053", 
"3117055", "3117058", "3117071", "3117072", "3117084", "3117092", 
"3117115", "3117158", "3117196", "3117249", "3118375", "3123348", 
"3123358", "3123364", "3123379", "3123387", "3123396", "3123397", 
"3123402", "3123403", "3123423", "3123433", "3123444", "3123499", 
"3123543", "3123713", "3123837", "3187747", "3187753", "3187986", 
"3206823", "3206826", "3206827", "3206828", "3206829", "3206830", 
"3206838", "3206844", "3207921", "3218018", "3218393", "3076980", 
"3077130", "3077181", "3077221", "3080756", "3117057", "3117059", 
"3117061", "3117065", "3117068", "3117073", "3117078", "3117088", 
"3117090", "3117162", "3117170", "3117175", "3117327", "3117922", 
"3123344", "3123345", "3123346", "3123354", "3123361", "3123367", 
"3123381", "3123388", "3123392", "3123404", "3123406", "3123411", 
"3123418", "3123427", "3123446", "3123474", "3123666", "3123757", 
"3187745", "3187918", "3189130", "3206813", "3206831", "3206833", 
"3206834", "3206837", "3206839", "3206848", "3218015", "3218023", 
"3218024", "3249231", "3250005"), class = "factor"), stimulus.accent = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L), levels = c("SSBE", 
"Belfast", "Birmingham", "Bradford", "Bristol", "Cardiff", "Glasgow", 
"Liverpool", "London", "Newcastle"), class = "factor"), question.type = structure(c(5L, 
6L, 4L, 7L, 6L, 7L, 2L, 3L, 1L, 6L, 4L, 1L, 2L, 5L, 4L, 3L, 5L, 
6L, 1L, 6L, 4L, 1L, 5L, 7L, 2L, 2L, 1L, 4L, 2L, 1L, 7L, 5L, 4L, 
5L, 4L, 3L, 4L, 4L, 1L, 7L, 7L, 3L, 3L, 4L, 6L, 5L, 3L, 7L, 7L, 
1L, 2L, 6L, 4L, 1L, 5L, 2L, 2L, 1L, 5L, 1L, 4L, 2L, 2L, 3L, 4L, 
2L, 4L, 2L, 4L, 3L, 2L, 4L, 1L, 4L, 1L, 3L, 3L, 3L, 2L, 4L, 4L, 
4L, 2L, 4L, 6L, 1L, 2L, 7L, 2L, 4L, 4L, 4L, 6L, 5L, 6L, 2L, 2L, 
7L, 1L, 2L, 4L, 7L, 2L, 2L, 4L, 2L, 4L, 3L, 1L, 4L, 5L, 5L, 2L, 
6L, 1L, 4L, 3L, 6L, 4L, 5L, 2L, 4L, 2L, 6L, 7L, 5L, 4L, 2L, 7L, 
5L, 5L, 4L, 1L, 7L, 1L, 4L, 6L, 3L, 2L, 5L, 4L, 1L, 7L, 4L, 1L, 
1L, 2L, 2L, 2L, 4L, 4L, 4L, 2L, 7L, 2L, 3L, 5L, 2L, 1L, 5L, 4L, 
5L, 6L, 4L, 4L, 3L, 1L, 4L, 4L, 4L, 2L, 2L, 1L, 6L, 4L, 5L, 3L, 
4L, 7L, 1L, 7L, 5L, 1L, 5L, 4L, 5L, 2L, 4L, 1L, 5L, 4L, 1L, 2L, 
3L, 1L, 2L, 4L, 1L, 3L, 4L, 7L, 1L, 1L, 2L, 5L, 7L, 4L, 1L, 1L, 
4L, 2L, 3L, 4L, 4L, 5L, 3L, 2L, 7L, 1L, 1L, 4L, 5L, 7L, 1L, 7L, 
2L, 4L, 7L, 1L, 3L, 2L, 1L, 4L, 4L, 4L, 4L, 4L, 1L, 4L, 3L, 6L, 
4L, 4L, 6L, 2L, 5L, 1L, 2L, 6L, 5L, 3L, 4L, 4L, 2L, 1L, 2L, 4L, 
6L, 3L, 4L, 1L, 3L, 2L, 2L, 3L, 2L, 2L, 2L, 3L, 1L, 1L, 4L, 3L, 
7L, 6L, 1L, 5L, 6L, 4L, 1L, 4L, 2L, 2L, 1L, 2L, 5L, 3L, 4L, 1L, 
1L, 2L, 6L, 4L, 3L, 7L, 2L, 5L, 7L, 2L, 1L, 5L, 1L, 7L, 1L, 7L, 
2L, 4L, 4L, 4L, 3L, 3L, 1L, 4L, 6L, 2L, 4L, 6L, 1L, 7L, 2L, 1L, 
1L, 4L, 2L, 1L, 7L, 3L, 5L, 6L, 2L, 1L, 2L, 4L, 4L, 1L, 2L, 2L, 
4L, 1L, 1L, 1L, 6L, 2L, 3L, 5L, 5L, 6L, 1L, 1L, 7L, 6L, 7L, 2L, 
3L, 1L, 4L, 3L, 5L, 7L, 1L, 1L, 5L, 5L, 7L, 4L, 1L, 4L, 5L, 4L, 
7L, 1L, 4L, 2L, 1L, 4L, 3L, 2L, 5L, 5L, 2L, 3L, 4L, 5L, 4L, 5L, 
5L, 1L, 5L, 7L, 1L, 4L, 4L, 7L, 4L, 2L, 1L, 2L, 1L, 2L, 1L, 6L, 
1L, 5L, 3L, 5L, 4L, 4L, 3L, 2L, 3L, 3L, 2L, 3L, 4L, 6L, 4L, 4L, 
5L, 2L, 3L, 1L, 7L, 5L, 1L, 2L, 4L, 2L, 1L, 2L, 4L, 3L, 4L, 5L, 
6L, 7L, 1L, 1L, 2L, 5L, 4L, 2L, 5L, 3L, 7L, 1L, 2L, 1L, 4L, 1L, 
7L, 2L, 4L, 4L, 5L, 1L, 5L, 7L, 4L, 4L, 2L, 2L, 1L, 1L, 3L, 7L, 
5L, 5L, 7L, 3L, 5L, 1L, 4L, 7L, 2L, 6L, 1L, 1L, 1L, 5L, 2L, 2L, 
4L, 7L, 1L, 6L, 7L, 4L, 2L, 3L, 2L, 4L, 4L, 2L, 1L, 2L, 3L, 1L, 
4L, 7L, 2L, 7L, 7L, 6L, 2L, 4L, 1L, 3L, 1L, 7L, 1L, 2L, 3L, 6L, 
2L, 3L, 2L, 1L, 1L, 4L, 4L, 1L, 2L, 7L, 1L, 1L, 2L, 7L, 1L, 6L, 
2L, 7L, 2L, 4L, 1L, 4L, 1L, 5L, 2L, 4L, 1L, 3L, 1L, 4L, 4L, 1L, 
7L, 6L, 1L, 4L, 1L, 4L, 2L, 7L, 6L, 1L, 4L, 4L, 4L, 3L, 4L, 7L, 
2L, 2L, 4L, 4L, 2L, 3L, 1L, 1L, 1L, 3L, 2L, 5L, 4L, 3L, 4L, 4L, 
5L, 2L, 5L, 4L, 2L, 6L, 4L, 4L, 4L, 2L, 2L, 1L, 1L, 5L, 4L, 1L, 
5L, 2L, 1L, 2L, 1L, 4L, 4L, 5L, 4L, 7L, 2L, 2L, 1L, 1L, 4L, 1L, 
2L, 2L, 4L, 3L, 7L, 5L, 2L, 4L, 2L, 1L, 7L, 4L, 1L, 1L, 4L, 5L, 
5L, 4L, 5L, 7L, 4L, 7L, 2L, 2L, 4L, 2L, 7L, 1L, 7L, 7L, 4L, 2L, 
4L, 3L, 1L, 2L, 5L, 3L, 4L, 3L, 3L, 6L, 1L, 6L, 6L, 5L, 1L, 5L, 
1L, 4L, 4L, 1L, 1L, 6L, 2L, 2L, 1L, 4L, 2L, 5L, 1L, 3L, 4L, 1L, 
2L, 1L, 4L, 7L, 2L, 4L, 5L, 5L, 6L, 2L, 2L, 7L, 1L, 7L, 6L, 1L, 
5L, 7L, 3L, 4L, 2L, 4L, 1L, 6L, 3L, 7L, 1L, 1L, 7L, 1L, 1L, 6L, 
7L, 4L, 4L, 7L, 3L, 1L, 5L, 6L, 7L, 3L, 3L, 2L, 2L, 3L, 6L, 1L, 
4L, 2L, 2L, 4L, 5L, 5L, 7L, 3L, 6L, 2L, 7L, 4L, 1L, 5L, 1L, 4L, 
2L, 2L, 6L, 2L, 4L, 1L, 2L, 5L, 5L, 1L, 2L, 1L, 6L, 4L, 1L, 5L, 
5L, 4L, 4L, 6L, 2L, 4L, 2L, 2L, 4L, 7L, 4L, 4L, 3L, 1L, 2L, 4L, 
3L, 5L, 7L, 5L, 5L, 5L, 2L, 4L, 1L, 7L, 2L, 2L, 7L, 5L, 3L, 7L, 
4L, 1L, 2L, 1L, 2L, 2L, 7L, 5L, 4L, 2L, 1L, 2L, 1L, 7L, 2L, 5L, 
1L, 1L, 7L, 4L, 1L, 7L, 3L, 1L, 7L, 3L, 1L, 1L, 4L, 4L, 2L, 1L, 
2L, 5L, 4L, 3L, 1L, 1L, 2L, 4L, 5L, 4L, 2L, 7L, 4L, 1L, 4L, 3L, 
5L, 1L, 4L, 3L, 5L, 1L, 4L, 1L, 4L, 4L, 2L, 4L, 7L, 4L, 2L, 5L, 
4L, 4L, 2L, 4L, 3L, 2L, 3L, 1L, 3L, 4L, 1L, 5L, 3L, 7L, 5L, 2L, 
2L, 3L, 7L, 4L, 3L, 7L, 1L, 4L, 5L, 1L, 3L, 1L, 7L, 4L, 4L, 1L, 
7L, 2L, 1L, 3L, 4L, 1L, 4L, 1L, 4L, 1L, 1L, 5L, 4L, 3L, 2L, 4L, 
2L, 4L, 3L, 3L, 1L, 2L, 5L, 1L, 2L, 4L, 4L, 7L, 1L, 7L, 1L, 1L, 
4L, 2L, 1L, 4L, 6L, 7L, 2L, 2L, 7L, 1L, 7L, 1L, 7L, 4L, 1L, 2L, 
2L, 7L, 2L, 7L, 4L, 1L, 4L, 5L, 7L, 2L, 2L, 7L, 4L, 1L, 4L, 1L, 
1L, 1L, 2L, 5L, 3L, 4L, 3L, 4L, 4L, 4L, 5L, 2L, 4L, 3L, 5L, 1L, 
2L, 3L, 3L, 4L, 4L, 2L, 4L, 4L, 4L, 7L, 4L, 1L, 7L, 1L, 4L, 1L, 
4L, 3L, 5L, 7L, 7L, 2L, 5L, 2L, 4L, 1L, 5L, 1L, 4L, 2L, 4L, 1L, 
2L, 5L, 1L, 7L, 1L, 4L, 2L), levels = c("status", "solidarity", 
"dynamism", "crime", "morally bad", "morally ambiguous", "morally good"
), class = c("ordered", "factor")), response = structure(c(1L, 
4L, 1L, 3L, 4L, 3L, 4L, 2L, 5L, 5L, 1L, 6L, 5L, 1L, 1L, 2L, 5L, 
6L, 6L, 2L, 2L, 6L, 5L, 7L, 4L, 5L, 1L, 5L, 4L, 5L, 5L, 5L, 1L, 
2L, 2L, 6L, 2L, 1L, 5L, 6L, 2L, 7L, 2L, 1L, 6L, 5L, 3L, 5L, 6L, 
6L, 4L, 1L, 1L, 7L, 3L, 5L, 5L, 5L, 5L, 4L, 1L, 5L, 6L, 7L, 4L, 
4L, 2L, 5L, 2L, 1L, 5L, 4L, 3L, 1L, 4L, 5L, 5L, 4L, 6L, 1L, 2L, 
1L, 2L, 1L, 3L, 4L, 2L, 3L, 5L, 3L, 1L, 4L, 5L, 5L, 3L, 5L, 5L, 
4L, 7L, 4L, 2L, 7L, 6L, 4L, 1L, 6L, 1L, 5L, 3L, 1L, 7L, 3L, 6L, 
1L, 4L, 1L, 6L, 5L, 5L, 4L, 6L, 1L, 4L, 3L, 5L, 4L, 4L, 4L, 4L, 
1L, 5L, 3L, 4L, 5L, 3L, 2L, 1L, 6L, 5L, 2L, 4L, 4L, 5L, 3L, 5L, 
5L, 7L, 4L, 4L, 1L, 2L, 3L, 4L, 4L, 4L, 6L, 3L, 5L, 6L, 4L, 4L, 
3L, 3L, 1L, 3L, 3L, 1L, 2L, 2L, 1L, 2L, 6L, 5L, 1L, 1L, 2L, 6L, 
2L, 4L, 5L, 7L, 3L, 3L, 1L, 2L, 1L, 6L, 1L, 3L, 2L, 3L, 4L, 2L, 
5L, 4L, 4L, 2L, 3L, 6L, 1L, 2L, 3L, 5L, 5L, 2L, 6L, 3L, 3L, 4L, 
1L, 5L, 4L, 1L, 4L, 4L, 5L, 6L, 3L, 3L, 6L, 5L, 1L, 3L, 2L, 1L, 
5L, 4L, 4L, 3L, 1L, 6L, 3L, 3L, 2L, 4L, 1L, 3L, 4L, 3L, 4L, 5L, 
2L, 1L, 2L, 3L, 5L, 4L, 5L, 2L, 6L, 2L, 5L, 1L, 6L, 4L, 4L, 2L, 
3L, 3L, 2L, 2L, 5L, 3L, 6L, 5L, 5L, 5L, 3L, 6L, 4L, 4L, 5L, 5L, 
5L, 3L, 3L, 1L, 1L, 1L, 4L, 3L, 5L, 6L, 7L, 5L, 3L, 3L, 1L, 5L, 
7L, 4L, 2L, 1L, 3L, 7L, 5L, 3L, 1L, 6L, 2L, 4L, 6L, 5L, 4L, 2L, 
4L, 1L, 2L, 5L, 2L, 5L, 7L, 1L, 3L, 5L, 2L, 1L, 2L, 4L, 4L, 4L, 
4L, 5L, 5L, 5L, 2L, 7L, 4L, 1L, 7L, 5L, 6L, 1L, 2L, 4L, 4L, 4L, 
1L, 2L, 5L, 3L, 1L, 4L, 3L, 4L, 3L, 4L, 5L, 3L, 7L, 3L, 7L, 4L, 
2L, 3L, 4L, 3L, 5L, 4L, 4L, 5L, 5L, 3L, 1L, 2L, 5L, 1L, 4L, 3L, 
6L, 4L, 4L, 5L, 5L, 3L, 4L, 4L, 1L, 7L, 5L, 4L, 3L, 1L, 6L, 5L, 
4L, 7L, 4L, 3L, 3L, 4L, 5L, 3L, 6L, 5L, 5L, 6L, 3L, 6L, 6L, 2L, 
4L, 7L, 2L, 3L, 6L, 5L, 2L, 5L, 2L, 2L, 4L, 3L, 4L, 4L, 5L, 4L, 
5L, 4L, 5L, 3L, 1L, 5L, 5L, 5L, 2L, 4L, 4L, 5L, 2L, 5L, 3L, 4L, 
2L, 5L, 4L, 6L, 4L, 5L, 4L, 5L, 3L, 1L, 2L, 2L, 4L, 4L, 2L, 3L, 
5L, 4L, 2L, 2L, 3L, 5L, 5L, 3L, 4L, 3L, 6L, 3L, 6L, 3L, 6L, 6L, 
3L, 3L, 5L, 1L, 2L, 3L, 2L, 5L, 5L, 2L, 2L, 5L, 5L, 5L, 5L, 5L, 
2L, 3L, 5L, 1L, 4L, 2L, 5L, 1L, 4L, 2L, 4L, 4L, 3L, 5L, 3L, 5L, 
4L, 7L, 3L, 7L, 4L, 1L, 4L, 1L, 3L, 1L, 3L, 3L, 3L, 3L, 6L, 7L, 
5L, 3L, 3L, 3L, 6L, 5L, 2L, 5L, 3L, 5L, 3L, 3L, 4L, 3L, 5L, 4L, 
4L, 3L, 5L, 4L, 4L, 3L, 4L, 3L, 5L, 1L, 4L, 1L, 5L, 3L, 1L, 5L, 
4L, 2L, 4L, 3L, 4L, 4L, 4L, 3L, 4L, 5L, 3L, 2L, 3L, 4L, 5L, 4L, 
6L, 6L, 5L, 5L, 5L, 6L, 2L, 6L, 5L, 3L, 4L, 4L, 2L, 3L, 3L, 5L, 
2L, 5L, 4L, 1L, 6L, 1L, 2L, 1L, 2L, 5L, 4L, 5L, 5L, 2L, 1L, 4L, 
4L, 4L, 3L, 4L, 5L, 5L, 2L, 2L, 4L, 7L, 5L, 3L, 5L, 6L, 4L, 2L, 
2L, 6L, 1L, 6L, 3L, 3L, 6L, 1L, 5L, 4L, 3L, 6L, 5L, 4L, 6L, 4L, 
1L, 1L, 3L, 5L, 3L, 4L, 5L, 5L, 5L, 6L, 5L, 4L, 6L, 5L, 5L, 5L, 
5L, 1L, 5L, 5L, 5L, 2L, 5L, 4L, 3L, 2L, 2L, 3L, 4L, 5L, 2L, 4L, 
5L, 2L, 2L, 6L, 3L, 3L, 4L, 6L, 5L, 2L, 4L, 2L, 6L, 1L, 5L, 5L, 
5L, 3L, 2L, 3L, 5L, 3L, 3L, 4L, 4L, 4L, 5L, 7L, 5L, 4L, 4L, 7L, 
2L, 5L, 7L, 4L, 2L, 2L, 5L, 2L, 5L, 5L, 4L, 4L, 6L, 6L, 3L, 4L, 
3L, 5L, 1L, 4L, 5L, 4L, 1L, 5L, 2L, 3L, 4L, 5L, 4L, 5L, 2L, 2L, 
4L, 4L, 2L, 3L, 1L, 4L, 5L, 5L, 1L, 5L, 2L, 1L, 1L, 5L, 4L, 6L, 
5L, 3L, 3L, 2L, 3L, 4L, 3L, 2L, 6L, 3L, 2L, 5L, 1L, 6L, 7L, 2L, 
3L, 4L, 1L, 1L, 5L, 1L, 5L, 4L, 2L, 4L, 6L, 3L, 1L, 5L, 4L, 2L, 
3L, 7L, 5L, 6L, 3L, 4L, 4L, 1L, 7L, 6L, 5L, 5L, 5L, 4L, 5L, 5L, 
3L, 3L, 4L, 2L, 4L, 2L, 6L, 5L, 1L, 4L, 4L, 4L, 6L, 5L, 6L, 7L, 
5L, 2L, 6L, 3L, 3L, 5L, 5L, 5L, 5L, 2L, 3L, 5L, 2L, 1L, 3L, 2L, 
2L, 3L, 6L, 5L, 5L, 5L, 5L, 6L, 5L, 3L, 4L, 4L, 1L, 3L, 1L, 2L, 
3L, 3L, 2L, 3L, 4L, 2L, 2L, 6L, 3L, 4L, 5L, 1L, 5L, 3L, 4L, 7L, 
2L, 2L, 4L, 2L, 1L, 4L, 5L, 2L, 3L, 5L, 2L, 2L, 5L, 3L, 6L, 4L, 
3L, 4L, 3L, 1L, 2L, 4L, 3L, 5L, 4L, 6L, 2L, 2L, 5L, 4L, 5L, 2L, 
5L, 5L, 3L, 3L, 1L, 4L, 2L, 3L, 1L, 2L, 2L, 5L, 2L, 2L, 4L, 4L, 
6L, 3L, 1L, 1L, 5L, 2L, 2L, 2L, 3L, 4L, 2L, 3L, 6L, 6L, 5L, 5L, 
3L, 4L, 6L, 7L, 2L, 2L, 5L, 3L, 3L, 4L, 5L, 4L, 5L, 1L, 2L, 4L, 
6L, 5L, 4L, 6L, 2L, 3L, 4L, 3L, 6L, 5L, 2L, 3L, 2L, 4L, 5L, 2L, 
4L, 7L, 5L, 6L, 3L, 3L, 3L, 3L, 3L, 3L, 5L, 4L, 4L, 4L, 4L, 2L, 
5L, 5L, 2L, 2L, 4L, 4L, 5L, 4L, 1L, 4L, 2L, 6L, 1L, 3L, 4L, 3L, 
6L, 4L, 5L, 5L, 5L, 6L, 5L, 5L, 5L, 3L, 3L, 6L, 1L, 2L, 5L, 6L, 
3L, 5L, 5L, 4L, 3L, 2L, 3L), levels = c("1", "2", "3", "4", "5", 
"6", "7"), class = c("ordered", "factor"))), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -1000L), groups = structure(list(
    stimulus.accent = structure(1:10, levels = c("SSBE", "Belfast", 
    "Birmingham", "Bradford", "Bristol", "Cardiff", "Glasgow", 
    "Liverpool", "London", "Newcastle"), class = "factor"), .rows = structure(list(
        1:100, 101:200, 201:300, 301:400, 401:500, 501:600, 601:700, 
        701:800, 801:900, 901:1000), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), row.names = c(NA, -10L), class = c("tbl_df", 
"tbl", "data.frame"), .drop = TRUE))

There are 100 participants, 10 voices and 7 question types. Each participant heard every voice, but did not answer every question type. Additionally, the number of responses per participant differs between question types, so the number of responses per question type also differs between voices.

I want to randomly sample the data so that there is an equal number of responses per question type for each voice. Currently the minimum number of responses per voice and per question type is 50, so that's the number I'd like for each question type for each voice. This would mean that instead of 10,000 rows I would have 3,500 (10 voices x 7 question types x 50 responses). But crucially, I want to be sure that each participant (ID) is only sampled once for each question type (question.type) for each voice (stimulus.accent).

I've tried using group_by() and sample_n() from dplyr, but can't work out how to set different target sampling for different groupings.

LMK if I've done any of this wrong, and thanks in advance!

CodePudding user response:

"Currently the minimum number of responses per voice & per question type is 50"

It looks to me like this isn't the case, unless I've misunderstood what you've put, e.g.:

count(cor1, stimulus.accent, question.type)

shows there is at least one group of accent/question type with only 1 row.
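
To see how large an equal-sized sample can be, it may help to tabulate both the rows and the distinct IDs in each accent/question-type cell. The sketch below is only an illustration: toy is a small made-up data frame standing in for cor1, and the names group_sizes and max_equal_n are hypothetical.

```r
library(dplyr)

# Hypothetical toy data standing in for cor1 (not the asker's data)
set.seed(1)
toy <- tibble(
  ID              = sample(1:20, 200, replace = TRUE),
  stimulus.accent = sample(c("SSBE", "Belfast"), 200, replace = TRUE),
  question.type   = sample(c("status", "crime", "dynamism"), 200, replace = TRUE)
)

# Rows and distinct participants in each accent/question-type cell
group_sizes <- toy %>%
  group_by(stimulus.accent, question.type) %>%
  summarise(n_rows = n(), n_ids = n_distinct(ID), .groups = "drop")

# Largest equal sample size that still allows one row per ID in every cell
max_equal_n <- min(group_sizes$n_ids)

group_sizes
max_equal_n
```

If each participant may appear only once per cell, the smallest n_ids (rather than the smallest n_rows) is the upper limit for a balanced sample.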

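That aside, it looks like this can be solved with slice_sample() (which has superseded sample_n()). If you want to ensure there is only one row per ID, you can simply sample a single row for each ID, accent and question type first. Note that slice_sample() does not fail when a group has fewer rows than requested: it silently returns the whole group, so groups with fewer than 50 rows will stay below 50.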

# 50 random rows per accent and question type
cor1 %>%
  group_by(stimulus.accent, question.type) %>%
  slice_sample(n = 50)

# Or, first ensure maximum of one row per ID, per accent and question type
cor1 %>%
  group_by(ID, stimulus.accent, question.type) %>%
  slice_sample(n = 1) %>%
  group_by(stimulus.accent, question.type) %>%
  slice_sample(n = 50)
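
If the target size should come from the data rather than being fixed at 50, one option is to compute the smallest cell size after reducing to one row per ID, then sample that many rows from every cell. This is a sketch under assumptions: toy is made-up data standing in for cor1, and one_per_id, target and balanced are hypothetical names.

```r
library(dplyr)

# Hypothetical toy data standing in for cor1 (not the asker's data)
set.seed(42)
toy <- tibble(
  ID              = sample(1:30, 600, replace = TRUE),
  stimulus.accent = sample(c("SSBE", "Belfast", "Cardiff"), 600, replace = TRUE),
  question.type   = sample(c("status", "crime"), 600, replace = TRUE),
  response        = sample(1:7, 600, replace = TRUE)
)

# Step 1: keep one random row per ID within each accent/question-type cell
one_per_id <- toy %>%
  group_by(ID, stimulus.accent, question.type) %>%
  slice_sample(n = 1) %>%
  ungroup()

# Step 2: the largest sample size that every cell can supply
target <- one_per_id %>%
  count(stimulus.accent, question.type) %>%
  pull(n) %>%
  min()

# Step 3: draw the same number of rows from every cell
balanced <- one_per_id %>%
  group_by(stimulus.accent, question.type) %>%
  slice_sample(n = target) %>%
  ungroup()

# Every cell should now contain exactly `target` rows
count(balanced, stimulus.accent, question.type)
```

Because the de-duplication happens before the per-cell draw, no ID can appear twice within the same accent and question type in balanced.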

CodePudding user response:

Here is a way with the sampling package.
The main function is strata(), which stratifies the sampling by the variables passed in stratanames. I set the stratum sample sizes to 1 only because stratum 69 has just 1 element.

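library(sampling)

# Cross-tabulate columns 2 and 3 (stimulus.accent and question.type)
tbl <- table(cor1[2:3])
# One sampled row for every non-empty stratum
size <- rep(1, length(tbl[tbl != 0]))

# "srswor" = simple random sampling without replacement within each stratum
s <- strata(cor1, stratanames = c("stimulus.accent", "question.type"),
            size = size, method = "srswor")
# Extract the sampled rows from cor1
cor1_sample <- getdata(cor1, s)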

str(cor1_sample)
#> 'data.frame':    69 obs. of  7 variables:
#>  $ ID             : Factor w/ 100 levels "3076960","3077063",..: 57 60 61 72 41 25 44 65 95 43 ...
#>  $ response       : Ord.factor w/ 7 levels "1"<"2"<"3"<"4"<..: 5 4 1 3 5 3 5 3 7 6 ...
#>  $ stimulus.accent: Factor w/ 10 levels "SSBE","Belfast",..: 1 1 1 1 1 1 1 2 2 2 ...
#>  $ question.type  : Ord.factor w/ 7 levels "status"<"solidarity"<..: 5 6 4 7 2 3 1 4 7 2 ...
#>  $ ID_unit        : int  17 5 33 4 26 47 9 132 181 172 ...
#>  $ Prob           : num  0.1 0.1 0.0417 0.1 0.05 ...
#>  $ Stratum        : int  1 2 3 4 5 6 7 8 9 10 ...

Created on 2022-09-13 with reprex v2.0.2
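
The same approach might be extended to draw more than one row per stratum by setting every stratum size to the smallest stratum count. The following is a sketch only, using a made-up data frame (toy) in place of cor1. It assumes duplicates per ID are removed first, since strata() itself knows nothing about participants. The names min_size, n_strata and toy_sample are hypothetical.

```r
library(sampling)

# Hypothetical toy data standing in for cor1 (not the asker's data)
set.seed(123)
toy <- data.frame(
  ID              = sample(1:30, 300, replace = TRUE),
  response        = sample(1:7, 300, replace = TRUE),
  stimulus.accent = sample(c("SSBE", "Belfast"), 300, replace = TRUE),
  question.type   = sample(c("status", "crime", "dynamism"), 300, replace = TRUE)
)

# Keep one row per ID within each stratum; shuffle first so that the row
# retained by duplicated() (the first occurrence) is a random one
toy <- toy[sample(nrow(toy)), ]
toy <- toy[!duplicated(toy[c("ID", "stimulus.accent", "question.type")]), ]

# Sort by the stratification variables and reset the row names
toy <- toy[order(toy$stimulus.accent, toy$question.type), ]
rownames(toy) <- NULL

# Use the smallest non-empty stratum count as the common sample size
tbl      <- table(toy$stimulus.accent, toy$question.type)
n_strata <- sum(tbl != 0)
min_size <- min(tbl[tbl != 0])

s <- strata(toy, stratanames = c("stimulus.accent", "question.type"),
            size = rep(min_size, n_strata), method = "srswor")
toy_sample <- getdata(toy, s)

# Each stratum should contribute exactly min_size rows
table(toy_sample$stimulus.accent, toy_sample$question.type)
```

Using the same size for every stratum sidesteps any concern about the order in which strata() matches the size vector to the strata.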
