How to mutate to input the mean as a column after using nest?

Time:01-30

I'm trying to learn how to use nest(). I want to nest by one of 3 time periods that participants could be in, and I want to add two columns. The first column is the overall mean, which I have figured out. Then I want to nest by the time variable to create 3 datasets (which I have also figured out) and compute the group mean within each. I read that you should create a function (here, section 6.3.1), but my function keeps failing. How would I do this?

Also, please use nest or nest_by in the solution. I know I could use group_by(), like someone else did here, but in my actual data, I need these to be 3 separate datasets due to other computations that I need to do.

#Here's my setup and sample data
library(dplyr)
library(purrr)
library(tidyr)

set.seed(1414)
test <- tibble(id = c(1:100),
               condition = c(rep(c("pre", "post"), 50)),
               time = c(case_when(condition == "pre" ~ 0,
                                  condition == "post" ~ sample(c(1, 2), size = c(100), replace = TRUE))),
               score = case_when(time == 0 ~ 1,
                                 time == 1 ~ 10,
                                 time == 2 ~ 100))


#Here's what I tried

#Nesting the data (works)
nested_test <- test %>%
  unite(col = "all_combos", c(condition, time)) %>%
  mutate(score2 = mean(score)) %>%
  nest_by(all_combos)

#Make mean function and map it (doesn't work)

my_mean <- function(data) {
  mean(score, na.rm = T)
}

nested_test %>%
  mutate(score3 = map(data, my_mean))

Answer:

nest_by() returns a rowwise data frame, so we need to ungroup() first. Then we can loop over the list column with map() and add the new column with mutate() inside each nested tibble:

library(dplyr)
library(purrr)
nested_test_new <- nested_test %>%
  ungroup() %>%
  mutate(data = map(data, ~ .x %>%
    mutate(score3 = mean(score, na.rm = TRUE))))

-output

nested_test_new
# A tibble: 3 × 2
  all_combos data             
  <chr>      <list>           
1 post_1     <tibble [19 × 4]>
2 post_2     <tibble [31 × 4]>
3 pre_0      <tibble [50 × 4]>
> nested_test_new$data
[[1]]
# A tibble: 19 × 4
      id score score2 score3
   <int> <dbl>  <dbl>  <dbl>
 1     2    10   33.4     10
 2     4    10   33.4     10
 3    14    10   33.4     10
 4    16    10   33.4     10
 5    18    10   33.4     10
 6    28    10   33.4     10
 7    30    10   33.4     10
 8    32    10   33.4     10
 9    38    10   33.4     10
10    44    10   33.4     10
11    48    10   33.4     10
12    60    10   33.4     10
13    64    10   33.4     10
14    78    10   33.4     10
15    80    10   33.4     10
16    86    10   33.4     10
17    92    10   33.4     10
18    96    10   33.4     10
19   100    10   33.4     10

[[2]]
# A tibble: 31 × 4
      id score score2 score3
   <int> <dbl>  <dbl>  <dbl>
 1     6   100   33.4    100
 2     8   100   33.4    100
 3    10   100   33.4    100
 4    12   100   33.4    100
...
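
As a side note on why the function in the question fails: inside my_mean(), `score` is looked up as a free variable rather than as a column of `data`, so R cannot find it. In addition, while the data frame is still rowwise, `data` inside mutate() is a single tibble, so map() would iterate over its columns rather than over the nested datasets. A minimal sketch of a repaired version follows, assuming a small stand-in tibble (`toy`, with made-up values) in place of the question's data; it stores one group mean per nested dataset with map_dbl():

```r
library(dplyr)
library(purrr)

# Hypothetical stand-in for the question's data (values are made up)
toy <- tibble(
  all_combos = c("pre_0", "pre_0", "post_1", "post_1", "post_2"),
  score      = c(1, 1, 10, 10, 100)
)

# Refer to the column through the argument, not as a bare name
my_mean <- function(data) {
  mean(data$score, na.rm = TRUE)
}

toy_means <- toy %>%
  nest_by(all_combos) %>%   # rowwise data frame with a list column `data`
  ungroup() %>%             # drop rowwise so map_dbl() sees the whole list column
  mutate(score3 = map_dbl(data, my_mean))

toy_means
```

Here score3 sits next to the nested datasets as one number per row, which may be handier than repeating the mean inside every nested tibble if it is only needed as a summary.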

Another option is nest_mutate() from the nplyr package:

library(nplyr)
test %>%
  unite(col = "all_combos", c(condition, time)) %>%
  mutate(score2 = mean(score)) %>%
  nest(data = -all_combos) %>%
  nest_mutate(data, score3 = mean(score, na.rm = TRUE))

-output

# A tibble: 3 × 2
  all_combos data             
  <chr>      <list>           
1 pre_0      <tibble [50 × 4]>
2 post_1     <tibble [19 × 4]>
3 post_2     <tibble [31 × 4]>
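
It should also be possible to stay with nest_by() and skip both ungroup() and map(). In a rowwise data frame, `data` inside mutate() refers to the single nested tibble for that row, so the modified tibble only needs to be wrapped in list(). The sketch below assumes a small stand-in tibble (`toy`, with made-up values) rather than the question's data:

```r
library(dplyr)

# Hypothetical stand-in for the question's data (values are made up)
toy <- tibble(
  all_combos = c("pre_0", "pre_0", "post_1", "post_1", "post_2"),
  score      = c(1, 1, 10, 10, 100)
)

toy_rowwise <- toy %>%
  nest_by(all_combos) %>%
  # Rowwise: `data` is one tibble per row; list() puts the result
  # back into the list column
  mutate(data = list(mutate(data, score3 = mean(score, na.rm = TRUE))))

# Inspect the first nested dataset
toy_rowwise$data[[1]]
```

The result is still rowwise, so call ungroup() afterwards if later steps should operate on the whole data frame at once.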