I'm trying to use dplyr::mutate() in combination with purrr::map() to create a "recipe" object with recipes::recipe().
If I do it outside of a tibble, this works fine:
library(rsample)
library(recipes)
iris_split <- initial_split(iris, prop = 0.6)
data_set_training <- training(iris_split)
recipe_standalone <- recipe(x = data_set_training, Species ~ .) # works
By contrast:
library(tibble)
library(dplyr)
library(purrr)
library(tidyr)
tibble(subset_training = data_set_training) %>%
  nest(subset_training = subset_training) %>%
  mutate(iris_recipe = map(.x = subset_training, .f = ~recipe(x = .x, Species ~ .))) # doesn't work
Error: Problem with `mutate()` column `iris_recipe`.
i `iris_recipe = map(.x = subset_training, .f = ~recipe(x = .x, Species ~ .))`.
x object 'Species' not found
How can I use map() to create a new list-column that contains the "recipe" object?
Desired output
To demonstrate, this is exactly what I want to get:
desired_output <-
  tibble(subset_training = list(data_set_training),
         iris_recipe = list(recipe_standalone))
## # A tibble: 1 x 2
## subset_training iris_recipe
## <list> <list>
## 1 <df [90 x 5]> <recipe>
Answer:
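The way you are nesting creates an odd structure. tibble(subset_training = data_set_training) stores the whole data frame as a single data-frame column, and nesting that column leaves you with a 90 x 1 tibble whose only column is itself a data frame. Inside map(), .x therefore has no top-level Species column, which would explain why recipe() reports that object 'Species' is not found.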
tibble(subset_training = data_set_training) %>%
  nest(subset_training = subset_training) %>%
  pull(subset_training) %>%
  first()
#> # A tibble: 90 × 1
#> subset_training$Sepal.Length $Sepal.Width $Petal.Length $Petal.Width $Species
#> <dbl> <dbl> <dbl> <dbl> <fct>
#> 1 6.3 3.3 6 2.5 virgini…
#> 2 6 2.2 4 1 versico…
#> 3 5.7 2.8 4.5 1.3 versico…
#> 4 7.2 3.6 6.1 2.5 virgini…
#> 5 5 3.5 1.3 0.3 setosa
#> 6 5.1 3.8 1.6 0.2 setosa
#> 7 7.2 3.2 6 1.8 virgini…
#> 8 5.7 4.4 1.5 0.4 setosa
#> 9 4.4 2.9 1.4 0.2 setosa
#> 10 5.2 3.4 1.4 0.2 setosa
#> # … with 80 more rows
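If the printed output is hard to read, the same structure can be checked by looking at column names directly. This is only a sketch: training_df is a stand-in for data_set_training (the first 90 rows of iris rather than a random split), and the use of tidyr::unpack() at the end is an addition for illustration, not part of the original code.

```r
library(tibble)
library(tidyr)
library(dplyr)

# Stand-in for data_set_training (assumption: any 5-column iris subset behaves the same)
training_df <- head(iris, 90)

# Reproduce the nesting from the question and pull out the single nested element
packed <- tibble(subset_training = training_df) %>%
  nest(subset_training = subset_training) %>%
  pull(subset_training) %>%
  first()

# Only one top-level column exists, so Species is not visible at this level
names(packed)

# The original columns sit one level down, inside the data-frame column
names(packed$subset_training)

# unpack() lifts the inner columns back to the top level
unpacked <- unpack(packed, cols = subset_training)
names(unpacked)
```

The first names() call returns only "subset_training", while the second and third return the five iris column names.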
Here's how you should be nesting it.
data_set_training %>%
  nest(subset_training = everything()) %>%
  pull(subset_training) %>%
  first()
#> # A tibble: 90 × 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <dbl> <dbl> <dbl> <dbl> <fct>
#> 1 6.3 3.3 6 2.5 virginica
#> 2 6 2.2 4 1 versicolor
#> 3 5.7 2.8 4.5 1.3 versicolor
#> 4 7.2 3.6 6.1 2.5 virginica
#> 5 5 3.5 1.3 0.3 setosa
#> 6 5.1 3.8 1.6 0.2 setosa
#> 7 7.2 3.2 6 1.8 virginica
#> 8 5.7 4.4 1.5 0.4 setosa
#> 9 4.4 2.9 1.4 0.2 setosa
#> 10 5.2 3.4 1.4 0.2 setosa
#> # … with 80 more rows
Then you get the results you're looking for:
data_set_training %>%
  nest(subset_training = everything()) %>%
  mutate(iris_recipe = map(
    .x = subset_training,
    .f = ~recipe(x = .x, Species ~ .)
  ))
#> # A tibble: 1 × 2
#> subset_training iris_recipe
#> <list> <list>
#> 1 <tibble [90 × 5]> <recipe>
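An alternative that mirrors the desired_output construction from the question is to wrap the data frame in list() when building the tibble, which skips nest() entirely. This is a sketch under assumptions: training_df stands in for data_set_training, and the recipe is written with the formula-first form recipe(formula, data = ...) instead of the x = form used above.

```r
library(tibble)
library(dplyr)
library(purrr)
library(recipes)

# Stand-in for data_set_training (assumption: first 90 rows instead of a random split)
training_df <- head(iris, 90)

# list() makes subset_training a list-column holding the whole data frame,
# so map() receives the data frame itself as .x
result <- tibble(subset_training = list(training_df)) %>%
  mutate(iris_recipe = map(subset_training, ~ recipe(Species ~ ., data = .x)))

result
```

Both approaches should give a one-row tibble with two list-columns. The list() version keeps the nested element as a plain data.frame, whereas nest() converts it to a tibble.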