I am trying to use purrr::map() inside dplyr::summarise(). The goal is to create a summary for each iteration. The map_dfc() function does this nicely but, as the name says, it column-binds the iterations, so I need an extra pivot_longer() call to get the result into long format, ready for plotting. I also saw that there is a map_dfr() function, which I hoped would row-bind the iterations and save me the pivot_longer() call. It also provides an .id argument to keep track of which iteration has been row-bound (if I understood correctly). However, both functions give the same output. Am I doing something wrong? See below for a reproducible example showing that the outputs of map_dfc() and map_dfr() are identical.
# packages
library(tidyverse)
# example dataset
tibble(site = rep(c(LETTERS[1:3]), each = 6),
       name = rep(c(letters[10:15]), 3),
       size = runif(18)) %>%
  arrange(site, name) -> d_tibble
head(d_tibble)
#> # A tibble: 6 x 3
#> site name size
#> <chr> <chr> <dbl>
#> 1 A j 0.633
#> 2 A k 0.318
#> 3 A l 0.241
#> 4 A m 0.378
#> 5 A n 0.352
#> 6 A o 0.298
# some custom function that is supposed to calculate "a" for a sequence of "i"'s
test_fct <- function(a, i) {
  a ^ i
}
# create sequence of i's
i_seq <- seq(0, 5, by = 0.1)
d_tibble %>%
  group_by(site, name) %>%
  summarise(purrr::map_dfc(set_names(i_seq), ~ test_fct(size, .x)), .groups = "drop") -> d_out
head(d_out)
#> # A tibble: 6 x 53
#> site name `0` `0.1` `0.2` `0.3` `0.4` `0.5` `0.6` `0.7` `0.8` `0.9` `1`
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 A j 1 0.955 0.913 0.872 0.833 0.796 0.760 0.726 0.694 0.663 0.633
#> 2 A k 1 0.892 0.795 0.709 0.632 0.564 0.502 0.448 0.399 0.356 0.318
#> 3 A l 1 0.867 0.752 0.652 0.566 0.491 0.426 0.369 0.320 0.278 0.241
#> 4 A m 1 0.907 0.823 0.747 0.678 0.615 0.558 0.506 0.460 0.417 0.378
#> 5 A n 1 0.901 0.812 0.731 0.659 0.593 0.535 0.482 0.434 0.391 0.352
#> 6 A o 1 0.886 0.785 0.695 0.616 0.546 0.483 0.428 0.379 0.336 0.298
#> # ... with 40 more variables: `1.1` <dbl>, `1.2` <dbl>, `1.3` <dbl>,
#> # `1.4` <dbl>, `1.5` <dbl>, `1.6` <dbl>, `1.7` <dbl>, `1.8` <dbl>,
#> # `1.9` <dbl>, `2` <dbl>, `2.1` <dbl>, `2.2` <dbl>, `2.3` <dbl>, `2.4` <dbl>,
#> # `2.5` <dbl>, `2.6` <dbl>, `2.7` <dbl>, `2.8` <dbl>, `2.9` <dbl>, `3` <dbl>,
#> # `3.1` <dbl>, `3.2` <dbl>, `3.3` <dbl>, `3.4` <dbl>, `3.5` <dbl>,
#> # `3.6` <dbl>, `3.7` <dbl>, `3.8` <dbl>, `3.9` <dbl>, `4` <dbl>, `4.1` <dbl>,
#> # `4.2` <dbl>, `4.3` <dbl>, `4.4` <dbl>, `4.5` <dbl>, `4.6` <dbl>, ...
d_out %>%
  pivot_longer(where(is.double), names_to = "names", values_to = "values")
#> # A tibble: 918 x 4
#> site name names values
#> <chr> <chr> <chr> <dbl>
#> 1 A j 0 1
#> 2 A j 0.1 0.955
#> 3 A j 0.2 0.913
#> 4 A j 0.3 0.872
#> 5 A j 0.4 0.833
#> 6 A j 0.5 0.796
#> 7 A j 0.6 0.760
#> 8 A j 0.7 0.726
#> 9 A j 0.8 0.694
#> 10 A j 0.9 0.663
#> # ... with 908 more rows
# there is also a map_dfr version that row-binds to a data frame and takes an .id argument
d_tibble %>%
  group_by(site, name) %>%
  summarise(purrr::map_dfr(set_names(i_seq), ~ test_fct(size, .x), .id = "id"), .groups = "drop") -> d_out2
head(d_out2)
#> # A tibble: 6 x 53
#> site name `0` `0.1` `0.2` `0.3` `0.4` `0.5` `0.6` `0.7` `0.8` `0.9` `1`
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 A j 1 0.955 0.913 0.872 0.833 0.796 0.760 0.726 0.694 0.663 0.633
#> 2 A k 1 0.892 0.795 0.709 0.632 0.564 0.502 0.448 0.399 0.356 0.318
#> 3 A l 1 0.867 0.752 0.652 0.566 0.491 0.426 0.369 0.320 0.278 0.241
#> 4 A m 1 0.907 0.823 0.747 0.678 0.615 0.558 0.506 0.460 0.417 0.378
#> 5 A n 1 0.901 0.812 0.731 0.659 0.593 0.535 0.482 0.434 0.391 0.352
#> 6 A o 1 0.886 0.785 0.695 0.616 0.546 0.483 0.428 0.379 0.336 0.298
#> # ... with 40 more variables: `1.1` <dbl>, `1.2` <dbl>, `1.3` <dbl>,
#> # `1.4` <dbl>, `1.5` <dbl>, `1.6` <dbl>, `1.7` <dbl>, `1.8` <dbl>,
#> # `1.9` <dbl>, `2` <dbl>, `2.1` <dbl>, `2.2` <dbl>, `2.3` <dbl>, `2.4` <dbl>,
#> # `2.5` <dbl>, `2.6` <dbl>, `2.7` <dbl>, `2.8` <dbl>, `2.9` <dbl>, `3` <dbl>,
#> # `3.1` <dbl>, `3.2` <dbl>, `3.3` <dbl>, `3.4` <dbl>, `3.5` <dbl>,
#> # `3.6` <dbl>, `3.7` <dbl>, `3.8` <dbl>, `3.9` <dbl>, `4` <dbl>, `4.1` <dbl>,
#> # `4.2` <dbl>, `4.3` <dbl>, `4.4` <dbl>, `4.5` <dbl>, `4.6` <dbl>, ...
Created on 2022-11-11 with reprex v2.0.2
CodePudding user response:
Maybe you are looking for something like this:
d_tibble |>
  group_split(site, name) |>
  map_dfr(~tibble(site = .x$site,
                  name = .x$name,
                  i = i_seq,
                  val = test_fct(.x$size, i_seq)))
#> # A tibble: 918 x 4
#> site name i val
#> <chr> <chr> <dbl> <dbl>
#> 1 A j 0 1
#> 2 A j 0.1 0.955
#> 3 A j 0.2 0.913
#> 4 A j 0.3 0.872
#> 5 A j 0.4 0.833
#> 6 A j 0.5 0.796
#> 7 A j 0.6 0.760
#> 8 A j 0.7 0.726
#> 9 A j 0.8 0.694
#> 10 A j 0.9 0.663
#> # ... with 908 more rows
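map_dfr() expects each iteration to return a data frame that it can row-bind. In your summarise() call each iteration returns a plain numeric value instead, so there is nothing to row-bind; the named values appear to be assembled as columns, which is why the result matches map_dfc(). If you split the data frame by group and build the expected long output for each group, map_dfr() will return the correct result.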
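
To make the difference concrete, here is a minimal sketch outside of summarise(). It assumes a shortened i_seq and a fixed input of 0.25 (both chosen only for illustration, not taken from the question) and contrasts an iteration that returns a bare number with one that returns a one-row tibble.

```r
library(dplyr)   # map_dfr() relies on dplyr for the row-binding
library(purrr)
library(tibble)

test_fct <- function(a, i) a ^ i
i_seq <- c(0, 0.5, 1)   # shortened sequence, for illustration only

# Case 1: each iteration returns a bare number.
# The named results are assembled as columns, so the output is wide.
wide <- map_dfr(set_names(i_seq), ~ test_fct(0.25, .x))

# Case 2: each iteration returns a one-row tibble.
# Now there are data frames to row-bind, and .id records the iteration name.
long <- map_dfr(set_names(i_seq), ~ tibble(val = test_fct(0.25, .x)), .id = "i")

dim(wide)
dim(long)
```

In the second case the i column holds the names set by set_names() as character strings, so it may need as.numeric() before plotting.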