I have a dataframe in which each row holds the arguments I want to pass to a function, one row at a time. The function returns a dataframe with a few rows. I would like to keep the arguments and the results together in one dataframe by calling pmap_df inside mutate, the same way pmap_dbl can be used inside mutate to add a new column of results. With the code below I get a column of nested data, but every row contains the results for all rows, not just the results corresponding to that row.
library(dplyr)  # group_by(), mutate(), data_frame()
library(tidyr)  # nest()
example_function <- function(data, string, ...){
  word_one <- paste(data$word_one, string)
  word_two <- paste(data$word_two, string)
  output <- data_frame(result_words = c(word_one, word_two))
}
fake_data <- tibble(group_id = rep(c(1, 2), each = 3),
                    word_one = c("hello", "goodbye", "today",
                                 "apple", "banana", "coconut"),
                    word_two = c("my", "name", "is",
                                 "ellie", "good", "morning"))
test <- fake_data %>%
  group_by(group_id) %>%
  nest() %>%
  mutate(string = "not working") %>%
  mutate(final_output = list(purrr::pmap_df(.l = ., .f = example_function)))
The output looks like:
Rows: 2
Columns: 4
Groups: group_id [2]
$ group_id <dbl> 1, 2
$ data <list> [<tbl_df[3 x 2]>], [<tbl_df[3 …
$ string <chr> "not working", "not working"
$ final_output <list> [<tbl_df[12 x 1]>], [<tbl_df[…
I would like each final_output dataframe to have only 6 rows, corresponding to the inputs in that row's nested data column. Is this possible?
CodePudding user response:
With the OP's function, this can be done without any pmap. First, make the function return output explicitly:
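example_function <- function(data, string, ...){
  # Append the string to every word in both columns of the nested data
  word_one <- paste(data$word_one, string)
  word_two <- paste(data$word_two, string)
  # tibble() replaces the deprecated data_frame()
  output <- tibble::tibble(result_words = c(word_one, word_two))
  # Return the result explicitly
  output
}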
Because nest_by() returns a rowwise data frame, the function can be applied directly inside mutate():
library(dplyr)
fake_data %>%
  nest_by(group_id) %>%
  mutate(string = "not working") %>%
  mutate(final_output = list(example_function(data, string)))
# A tibble: 2 × 4
# Rowwise:  group_id
  group_id               data string      final_output
     <dbl> <list<tibble[,2]>> <chr>       <list>
1        1            [3 × 2] not working <tibble [6 × 1]>
2        2            [3 × 2] not working <tibble [6 × 1]>
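To check that each nested result holds only the rows for its own group, the row counts can be inspected and the results flattened next to group_id. This is a minimal sketch, not part of the original answer. It repeats the data and the function so that it runs on its own, and the names res, n_rows and flat are only illustrative.

```r
library(dplyr)
library(tidyr)

# Same data and function as above, repeated so the snippet is self-contained
fake_data <- tibble(
  group_id = rep(c(1, 2), each = 3),
  word_one = c("hello", "goodbye", "today", "apple", "banana", "coconut"),
  word_two = c("my", "name", "is", "ellie", "good", "morning")
)

example_function <- function(data, string, ...) {
  tibble(result_words = c(paste(data$word_one, string),
                          paste(data$word_two, string)))
}

res <- fake_data %>%
  nest_by(group_id) %>%
  mutate(string = "not working") %>%
  mutate(final_output = list(example_function(data, string))) %>%
  ungroup()

# Number of rows in each nested result (3 input rows x 2 word columns)
n_rows <- vapply(res$final_output, nrow, integer(1))

# Optional: one long table with group_id next to each result row
flat <- res %>%
  select(group_id, final_output) %>%
  unnest(final_output)
```

Here n_rows should be 6 for both groups, and flat has one row per result word.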
With pmap, capture the arguments of each row as a list in an object 'x1', then apply the OP's function to the list elements, i.e. x1$data and x1$string:
library(purrr)
fake_data %>%
  nest_by(group_id) %>%
  mutate(string = "not working") %>%
  # Drop the rowwise grouping so pmap() sees all rows at once
  ungroup() %>%
  mutate(final_output = pmap(across(-group_id),
                             ~ {
                               # One row's columns, as a named list
                               x1 <- list(...)
                               example_function(x1$data, x1$string)
                             }))
# A tibble: 2 × 4
  group_id               data string      final_output
     <dbl> <list<tibble[,2]>> <chr>       <list>
1        1            [3 × 2] not working <tibble [6 × 1]>
2        2            [3 × 2] not working <tibble [6 × 1]>
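As for why the original attempt put 12 rows in every cell: inside the pipe, `.` refers to the whole piped data frame rather than the current row, so pmap_df runs the function on both rows and binds the two 6-row results together, and list() then places that same 12-row result in each group. Since only two columns are passed to the function here, purrr::map2 may be a simpler alternative to pmap. The sketch below is an illustration rather than part of the original answer. It repeats the data and function so it runs standalone, and res_map2 is an illustrative name.

```r
library(dplyr)
library(purrr)

# Same data and function as above, repeated so the snippet is self-contained
fake_data <- tibble(
  group_id = rep(c(1, 2), each = 3),
  word_one = c("hello", "goodbye", "today", "apple", "banana", "coconut"),
  word_two = c("my", "name", "is", "ellie", "good", "morning")
)

example_function <- function(data, string, ...) {
  tibble(result_words = c(paste(data$word_one, string),
                          paste(data$word_two, string)))
}

res_map2 <- fake_data %>%
  nest_by(group_id) %>%
  ungroup() %>%
  mutate(string = "not working") %>%
  # map2() walks the data and string columns in parallel, one row at a time
  mutate(final_output = map2(data, string, example_function))

# Rows per nested result
rows_per_group <- map_int(res_map2$final_output, nrow)
```

Each element of final_output is built only from the nested data in its own row, so rows_per_group should be 6 for both groups.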