I would like to generate a wide data frame of simulated observations using the stats::simulate function and the purrr package. Here's the basic idea:
library(tidyverse)
mod <- lm(mpg ~ cyl + disp, data = mtcars)
m_sim <- function(s) {
  stats::simulate(mod, nsim = 1, newdata = mtcars, seed = s)
}
df <- map_dfr(1:3, m_sim)
The problem is that this code row-binds the results, so each observation in the data set ends up with one row per simulation. For example, with three simulations (as above):
df %>%
  rownames_to_column(var = "car") %>%
  dplyr::filter(str_sub(car, 1, 10) == "Datsun 710")
#> car sim_1
#> 1 Datsun 710...3 23.53562
#> 2 Datsun 710...35 30.94046
#> 3 Datsun 710...67 26.87957
How can I fix the code so that the simulations appear as columns? Here, the desired output would be:
#> car sim_1 sim_2 sim_3
#> 1 Datsun 710 23.53562 30.94046 26.87957
CodePudding user response:
Instead of map_dfr(), use map_dfc() or the more recent list_cbind():
library(purrr)
library(dplyr)
library(stringr)
library(tibble)  # provides rownames_to_column()
map(1:3, m_sim) %>%
  list_cbind() %>%
  rownames_to_column(var = "car") %>%
  dplyr::filter(str_sub(car, 1, 10) == "Datsun 710")
CodePudding user response:
You can add a line to the function to rename the column using the seed, then use map_dfc() to column-bind:
library(dplyr)
library(purrr)
library(stringr)
library(tibble)  # provides rownames_to_column()
m_sim <- function(s) {
  stats::simulate(mod, nsim = 1, newdata = mtcars, seed = s) %>%
    # name the column after the seed, e.g. sim_2 when s = 2
    rename("sim_{s}" := sim_1)
}
map_dfc(1:3, m_sim) %>%
  rownames_to_column(var = "car") %>%
  dplyr::filter(str_sub(car, 1, 10) == "Datsun 710")
#> car sim_1 sim_2 sim_3
#> 1 Datsun 710 23.53562 30.94046 26.87957
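For comparison, here is a minimal base R sketch of the same column-binding idea, with no purrr or dplyr. It assumes the model was meant to be mpg ~ cyl + disp. The names seeds, sims, and wide are illustrative only and do not come from the question or the answers.

```r
# Assumption: the intended model formula is mpg ~ cyl + disp
mod <- lm(mpg ~ cyl + disp, data = mtcars)

seeds <- 1:3  # one seed per simulated column (illustrative name)

# Simulate once per seed and keep only the numeric vector of simulated values
sims <- lapply(seeds, function(s) stats::simulate(mod, nsim = 1, seed = s)[[1]])

# Name each simulation after its seed: sim_1, sim_2, sim_3
names(sims) <- paste0("sim_", seeds)

# Bind the simulations side by side, with the car names as the first column
wide <- cbind(car = rownames(mtcars), as.data.frame(sims))

# One row per car, one column per simulation
wide[wide$car == "Datsun 710", ]
```

Because each column is named before binding, no name repair is needed, and each car stays on a single row.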