How to flexibly supply a varying length argument to .l of pmap()

Time:11-05

Below is a pmap() operation that requires my data to be in wide format. I perform a few simulations each day and capture the max value per simulation as post_max.

library(tidyverse)

POST_SIMS <- 2
CONDITIONS <- 3
DURATION <- 2

df <-
    tibble(
        day = rep(1:DURATION, each = CONDITIONS),
        condition = rep(LETTERS[1:CONDITIONS], times = DURATION)
    ) |>
    rowwise() |>
    mutate(post = list(rnorm(POST_SIMS, 0, 1))) |>
    ungroup()

df_wide <- df |> 
    pivot_wider(
        id_cols = c(day), 
        names_from = "condition",
        values_from = 'post'
    ) 

df_wide |> 
    mutate(
        post_max = 
            pmap(
                .l = list(A,B,C), # This works, but needs manual updating
                .f = pmax)
    ) |> 
    unnest()

The problem is that I have to manually list the unique conditions when I reach pmap(list(A,B,C), pmax). This is undesirable because my goal is to write a simulation function that can accommodate any number of conditions.

Is there a way to capture the unique conditions generated in df and supply that as an argument to pmap() as I try and fail to do below?

my_conditions <- noquote(unique(df$condition)) 

df_wide |> 
    mutate(
        post_max = 
            pmap(
                .l = list(my_conditions), # How do I do this part? 
                .f = pmax)
    ) |> 
    unnest()

The list() that I supply to the .l argument is baffling me a bit. Its contents are obviously not strings: I write .l = list(A,B,C) with bare column names, which is usually convenient but obscures what pmap() is ingesting. I assume I am dealing with some kind of tidy evaluation, but the varying length of this argument differs from my typical tidy eval applications, where I simply pass my column names as quosures.

CodePudding user response:

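In this context, list(A,B,C) just selects columns A, B & C from the .data argument of mutate() (df_wide); putting them in a list basically generates a tibble-like structure, which is what pmap() iterates over row by row. Try replacing list(A,B,C) with pick(-day), which selects every column except day (pick() requires dplyr 1.1.0 or later):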

glimpse(df_wide)
#> Rows: 2
#> Columns: 4
#> $ day <int> 1, 2
#> $ A   <list> <-1.4857029, -0.2090127>, <-1.6142362, 0.2935161>
#> $ B   <list> <2.610475, -1.604595>, <-1.455556395, 0.003465559>
#> $ C   <list> <-0.06067370, 0.09182582>, <-0.5745877, -1.0695619>

df_wide |> 
  mutate(
    post_max = 
      pmap(
        .l = pick(-day),  # every column except day, however many there are
        .f = pmax)
  ) |> 
  unnest(cols = -day)  # unnest() requires `cols`; -day covers all list-columns
#> # A tibble: 4 × 5
#>     day      A        B       C post_max
#>   <int>  <dbl>    <dbl>   <dbl>    <dbl>
#> 1     1 -1.49   2.61    -0.0607   2.61  
#> 2     1 -0.209 -1.60     0.0918   0.0918
#> 3     2 -1.61  -1.46    -0.575   -0.575 
#> 4     2  0.294  0.00347 -1.07     0.294
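
The reason this works is that a data frame is itself a list of columns, so pmap() accepts the tibble returned by pick() just as it accepts list(A,B,C). Below is a minimal sketch of how that could be wrapped into a function that does not care how many conditions exist. It assumes dplyr 1.1.0 or later; the helper name add_post_max() and its id_col argument are hypothetical, introduced only for illustration, and the toy data is made up.

```r
library(dplyr)
library(purrr)
library(tibble)

# Hypothetical helper: adds the element-wise maximum across every
# list-column except the identifier column, however many there are.
add_post_max <- function(data_wide, id_col = "day") {
  data_wide |>
    mutate(
      # pick() returns a tibble of the selected columns; pmap() walks it
      # row by row and passes each row's vectors to pmax() as arguments.
      post_max = pmap(pick(-all_of(id_col)), pmax)
    )
}

set.seed(1)
wide <- tibble(
  day = 1:2,
  A = list(rnorm(2), rnorm(2)),
  B = list(rnorm(2), rnorm(2)),
  C = list(rnorm(2), rnorm(2)),
  D = list(rnorm(2), rnorm(2))  # a fourth condition needs no code change
)

result <- add_post_max(wide)
```

Because the column names are passed to pmax() as argument names, this sketch assumes no condition is literally called na.rm, which would collide with pmax()'s own argument.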

Combining rowwise() with max(c_across()) should deliver the same result, and I would guess it's a bit easier to follow:

df_wide |> 
  unnest_longer(-day) |>
  rowwise() |>
  mutate(post_max = max(c_across(-day))) |>
  ungroup()
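
If the condition names are already held in a character vector, as with my_conditions in the question, they can also be handed to the column selection directly. The sketch below is one possible way to do that, not the only one: all_of() turns the strings into a selection inside pick() (dplyr 1.1.0 or later assumed), and the second variant shows the tidy-eval spelling that splices symbols into list(). The toy data is made up for illustration, and noquote() is not needed in either case.

```r
library(dplyr)
library(purrr)
library(tibble)

set.seed(42)
wide <- tibble(
  day = 1:2,
  A = list(rnorm(2), rnorm(2)),
  B = list(rnorm(2), rnorm(2)),
  C = list(rnorm(2), rnorm(2))
)

# Plain character vector of column names, e.g. unique(df$condition)
my_conditions <- c("A", "B", "C")

# Variant 1: select the columns by name with all_of()
by_name <- wide |>
  mutate(post_max = pmap(pick(all_of(my_conditions)), pmax))

# Variant 2: convert the names to symbols and splice them into list(),
# which expands to list(A, B, C) before mutate() evaluates it
by_splice <- wide |>
  mutate(post_max = pmap(list(!!!rlang::syms(my_conditions)), pmax))
```

Both variants should yield the same post_max column; the all_of() form is usually the easier one to read.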