I'm trying to bootstrap some data that are nested by group:
library(tidyverse)
library(tidymodels)
mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  mutate(boots = map(data, ~ bootstraps(.x, times = 1000, apparent = TRUE)))
I then need to access and process the data in the boots column in order to calculate the mean mpg per group. I have tried the following:
calc_mpg_mean <- function(split) {
  dat <- analysis(split) %>% pull(mpg)
  return(tibble(
    term = "mean",
    estimate = mean(dat),
    std.err = sd(dat) / sqrt(length(dat))))
}
mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  mutate(boots = map(data, ~ bootstraps(.x, times = 1000, apparent = TRUE))) %>%
  mutate(mean_mpg = map(boots, calc_mpg_mean))
But this doesn't work because I get:
Error in `mutate()`:
! Problem while computing `mean_mpg = map(boots, calc_mpg_mean)`.
ℹ The error occurred in group 1: cyl = 4.
Caused by error in `analysis()`:
! `x` should be an `rsplit` object
Presumably I am not accessing the nested column boots correctly. What am I doing wrong?
Answer:
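I don't have much (any) experience with tidymodels, but it looks like you just need to access your split object a few levels deeper. Each element of boots is a whole bootstraps tibble, and the rsplit objects that analysis() expects sit in its splits list-column.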
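To see where the rsplit objects live, it can help to inspect a single bootstraps result outside the pipeline. This is only a small sketch: it resamples all of mtcars instead of one group, uses 5 resamples to keep it quick, and the seed and the name first_mpg are my own additions.

```r
library(tidymodels)

set.seed(1)  # assumption: seed added only so the sketch is repeatable
boots <- bootstraps(mtcars, times = 5, apparent = TRUE)

# The bootstraps tibble itself is not an rsplit, which is why analysis() rejects it
inherits(boots, "rsplit")

# Each element of its splits list-column is an rsplit
inherits(boots$splits[[1]], "rsplit")

# So analysis() works once a single split is pulled out
first_mpg <- analysis(boots$splits[[1]]) %>% pull(mpg)
length(first_mpg)
```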
library(tidyverse)
library(tidymodels)
# split is a bootstraps tibble; index selects one rsplit from its splits column
calc_mpg_mean <- function(split, index) {
  dat <- analysis(split$splits[[index]]) %>% pull(mpg)
  return(tibble(
    term = "mean",
    estimate = mean(dat),
    std.err = sd(dat) / sqrt(length(dat))))
}
mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  mutate(boots = map(data, ~ bootstraps(.x, times = 1000, apparent = TRUE))) %>%
  # .x is the bootstraps tibble, .y its position in the boots list
  mutate(mean_mpg = imap(boots, ~ calc_mpg_mean(.x, .y)))
#> # A tibble: 3 × 4
#> # Groups:   cyl [3]
#>     cyl data               boots                    mean_mpg
#>   <dbl> <list>             <list>                   <list>
#> 1     6 <tibble [7 × 10]>  <bootstraps [1,001 × 2]> <tibble [1 × 3]>
#> 2     4 <tibble [11 × 10]> <bootstraps [1,001 × 2]> <tibble [1 × 3]>
#> 3     8 <tibble [14 × 10]> <bootstraps [1,001 × 2]> <tibble [1 × 3]>
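One caveat: mean_mpg above holds a single row per group. Because the nested tibble is still grouped by cyl, imap() sees a one-element list in each group, so the index it passes is always 1 and only the first resample is summarised. If the goal is a mean for every resample, one possible sketch is to keep the original one-argument calc_mpg_mean() and map over the splits column inside each group. The seed and the name res are my additions, not part of the question.

```r
library(tidyverse)
library(tidymodels)

# Summarise one rsplit: mean mpg and its standard error
calc_mpg_mean <- function(split) {
  dat <- analysis(split) %>% pull(mpg)
  tibble(
    term = "mean",
    estimate = mean(dat),
    std.err = sd(dat) / sqrt(length(dat))
  )
}

set.seed(123)  # assumption: seed added only for reproducibility

res <- mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  mutate(
    boots = map(data, ~ bootstraps(.x, times = 1000, apparent = TRUE)),
    # outer map(): one bootstraps tibble per group
    # inner map(): one rsplit per resample, row-bound into a single tibble
    mean_mpg = map(boots, function(b) bind_rows(map(b$splits, calc_mpg_mean)))
  )

# One row per group and resample
res %>%
  select(cyl, mean_mpg) %>%
  unnest(mean_mpg)
```

Each mean_mpg element then has one row per resample (the 1,000 bootstrap samples plus the apparent sample), which unnest() stacks into a long tibble keyed by cyl.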