I'm trying to learn how to use nest(). I want to nest by one of three time periods that participants could be in, and I want to add two columns. The first column is the overall mean, which I have figured out. Then I want to nest by the time variable to create three datasets (which I have also figured out) and compute the group mean within each. I read that you should create a function (here, section 6.3.1), but my function keeps failing. How would I do this?
Also, please use nest or nest_by in the solution. I know I could use group_by(), like someone else did here, but in my actual data, I need these to be 3 separate datasets due to other computations that I need to do.
#Here's my setup and sample data
library(dplyr)
library(purrr)
library(tidyr)
set.seed(1414)
test <- tibble(id = c(1:100),
               condition = c(rep(c("pre", "post"), 50)),
               time = c(case_when(condition == "pre" ~ 0,
                                  condition == "post" ~ sample(c(1, 2), size = c(100), replace = TRUE))),
               score = case_when(time == 0 ~ 1,
                                 time == 1 ~ 10,
                                 time == 2 ~ 100))
#Here's what I tried
#Nesting the data (works)
nested_test <- test %>%
  unite(col = "all_combos", c(condition, time)) %>%
  mutate(score2 = mean(score)) %>%
  nest_by(all_combos)
#Make mean function and map it (doesn't work)
my_mean <- function(data) {
  mean(score, na.rm = T)
}
nested_test %>%
  mutate(score3 = map(data, my_mean))
Answer:
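nest_by() returns a rowwise data frame, so we may need to ungroup() first, then loop over the data list column with map() and create the new column with mutate() inside each nested tibble. Note that the original my_mean() also fails on its own, because score is not visible inside the function; it would have to be referenced as data$score.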
library(dplyr)
library(purrr)
nested_test_new <- nested_test %>%
  ungroup() %>%
  mutate(data = map(data, ~ .x %>%
                      mutate(score3 = mean(score, na.rm = TRUE))))
-output
nested_test_new
# A tibble: 3 × 2
all_combos data
<chr> <list>
1 post_1 <tibble [19 × 4]>
2 post_2 <tibble [31 × 4]>
3 pre_0 <tibble [50 × 4]>
> nested_test_new$data
[[1]]
# A tibble: 19 × 4
id score score2 score3
<int> <dbl> <dbl> <dbl>
1 2 10 33.4 10
2 4 10 33.4 10
3 14 10 33.4 10
4 16 10 33.4 10
5 18 10 33.4 10
6 28 10 33.4 10
7 30 10 33.4 10
8 32 10 33.4 10
9 38 10 33.4 10
10 44 10 33.4 10
11 48 10 33.4 10
12 60 10 33.4 10
13 64 10 33.4 10
14 78 10 33.4 10
15 80 10 33.4 10
16 86 10 33.4 10
17 92 10 33.4 10
18 96 10 33.4 10
19 100 10 33.4 10
[[2]]
# A tibble: 31 × 4
id score score2 score3
<int> <dbl> <dbl> <dbl>
1 6 100 33.4 100
2 8 100 33.4 100
3 10 100 33.4 100
4 12 100 33.4 100
...
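If a named function is preferred, as in the question, the same idea can be sketched as follows. This is a minimal sketch assuming the setup from the question; the name nested_fn is only illustrative. The one change to my_mean() is that it reads the column from its argument (data$score) instead of looking for a free-standing score.

```r
library(dplyr)
library(purrr)
library(tidyr)

# Same sample data and nesting as in the question
set.seed(1414)
test <- tibble(
  id = 1:100,
  condition = rep(c("pre", "post"), 50),
  time = case_when(condition == "pre" ~ 0,
                   condition == "post" ~ sample(c(1, 2), size = 100, replace = TRUE)),
  score = case_when(time == 0 ~ 1,
                    time == 1 ~ 10,
                    time == 2 ~ 100)
)

nested_test <- test %>%
  unite(col = "all_combos", c(condition, time)) %>%
  mutate(score2 = mean(score)) %>%
  nest_by(all_combos)

# The function must pull score out of the tibble it receives
my_mean <- function(data) {
  mean(data$score, na.rm = TRUE)
}

# nested_fn is an illustrative name, not from the original post
nested_fn <- nested_test %>%
  ungroup() %>%                                   # drop the rowwise attribute
  mutate(data = map(data, ~ mutate(.x, score3 = my_mean(.x))))
```

Each element of nested_fn$data then carries a score3 column holding that group's mean, alongside the overall mean in score2.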
Another option is nest_mutate() from the nplyr package:
library(nplyr)
test %>%
  unite(col = "all_combos", c(condition, time)) %>%
  mutate(score2 = mean(score)) %>%
  nest(data = -all_combos) %>%
  nest_mutate(data, score3 = mean(score, na.rm = TRUE))
-output
# A tibble: 3 × 2
all_combos data
<chr> <list>
1 pre_0 <tibble [50 × 4]>
2 post_1 <tibble [19 × 4]>
3 post_2 <tibble [31 × 4]>
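It may also be possible to skip ungroup() and use the rowwise structure that nest_by() creates. In a rowwise data frame, data inside mutate() refers to the single tibble for the current row, so the group mean can be computed directly. The sketch below assumes the same sample data as the question; the name group_means is only illustrative, and score3 ends up as one value per group at the top level and is not added inside each nested tibble.

```r
library(dplyr)
library(tidyr)

# Same sample data as in the question
set.seed(1414)
test <- tibble(
  id = 1:100,
  condition = rep(c("pre", "post"), 50),
  time = case_when(condition == "pre" ~ 0,
                   condition == "post" ~ sample(c(1, 2), size = 100, replace = TRUE)),
  score = case_when(time == 0 ~ 1,
                    time == 1 ~ 10,
                    time == 2 ~ 100)
)

# group_means is an illustrative name, not from the original post
group_means <- test %>%
  unite(col = "all_combos", c(condition, time)) %>%
  mutate(score2 = mean(score)) %>%
  nest_by(all_combos) %>%                          # rowwise: one row per group
  mutate(score3 = mean(data$score, na.rm = TRUE))  # data is this row's tibble
```

The result is still rowwise, so later steps that should operate across all rows need an ungroup() first.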