Using a generated list of functions for mutate(across()) vs mutate_at

I'm working off this answer that describes the use of mutate_at and supplying a list of functions applied to a column. I have modified the code in that answer and have a working example that seems to produce the quantities I am looking for (growth rates of a variable over different intervals):

library(tidyverse)

set.seed(1)

## data
df <- data.frame(t = 1:10, y = runif(10))
lags <- c(1, 3, 5)

df %>% mutate_at(vars(y), .funs = {
  map(lags, function(i) ~ (.x - lag(.x, n = i)) / lag(.x, n = i)) %>%
    setNames(sprintf("growth_%1i", lags))
})

#     t          y    growth_1    growth_3   growth_5
# 1   1 0.26550866          NA          NA         NA
# 2   2 0.37212390  0.40155088          NA         NA
# 3   3 0.57285336  0.53941567          NA         NA
# 4   4 0.90820779  0.58541059  2.42063336         NA
# 5   5 0.20168193 -0.77793415 -0.45802478         NA
# 6   6 0.89838968  3.45448772  0.56827164  2.3836549
# 7   7 0.94467527  0.05152061  0.04015323  1.5386041
# 8   8 0.66079779 -0.30050271  2.27643527  0.1535200
# 9   9 0.62911404 -0.04794772 -0.29973145 -0.3073016
# 10 10 0.06178627 -0.90178844 -0.93459523 -0.6936450

However, since mutate_at has been superseded by the across syntax and for consistency with the rest of my code, I have been trying to get a working version with the new syntax. I have code that runs but doesn't seem to produce the new columns and I haven't been able to figure out why.


df %>% mutate(across(y, .funs = {
  map(lags, function(i) ~ (.x - lag(.x, n = i)) / lag(.x, n = i)) %>%
    setNames(sprintf("growth_%1i", lags))
}))

#     t          y
# 1   1 0.26550866
# 2   2 0.37212390
# 3   3 0.57285336
# 4   4 0.90820779
# 5   5 0.20168193
# 6   6 0.89838968
# 7   7 0.94467527
# 8   8 0.66079779
# 9   9 0.62911404
# 10 10 0.06178627

I had previously tried generating lists of functions outside the mutate call but couldn't get it to work. I thought the issue with the current code might be the placement of parentheses/braces/etc. but adjusting those hasn't resolved the problem. Any insights are appreciated.

Answer:

It is much easier to build the new columns outside of across and bind them to the original data than to create a list or tibble column inside across and then unnest it:

library(purrr)
library(stringr)
library(dplyr)
map_dfc(lags, ~ df %>% 
   transmute(!! str_c('growth_', .x) := (y - lag(y, n = .x))/lag(y, n = .x))) %>%
   bind_cols(df, .)

-output (the y values differ from the question's because df here came from a different random draw)

    t         y   growth_1   growth_3   growth_5
1   1 0.8696908         NA         NA         NA
2   2 0.3403490 -0.6086552         NA         NA
3   3 0.4820801  0.4164288         NA         NA
4   4 0.5995658  0.2437058 -0.3105989         NA
5   5 0.4935413 -0.1768355  0.4501036         NA
6   6 0.1862176 -0.6226910 -0.6137206 -0.7858807
7   7 0.8273733  3.4430457  0.3799541  1.4309557
8   8 0.6684667 -0.1920615  0.3544292  0.3866300
9   9 0.7942399  0.1881517  3.2651170  0.3246917
10 10 0.1079436 -0.8640919 -0.8695346 -0.7812876

If we want to use across, the function can return a tibble, which is then unnested and renamed:

library(tidyr)
df %>%
   mutate(across(y, function(.x)
     map_dfc(lags, function(i) (.x - lag(.x, i)) / lag(.x, i)),
       .names = "growth")) %>% 
   unnest(growth, names_sep = "_") %>%
   rename_with(~ str_c('growth_', lags), starts_with('growth'))

-output

# A tibble: 10 × 5
       t     y growth_1 growth_3 growth_5
   <int> <dbl>    <dbl>    <dbl>    <dbl>
 1     1 0.870   NA       NA       NA    
 2     2 0.340   -0.609   NA       NA    
 3     3 0.482    0.416   NA       NA    
 4     4 0.600    0.244   -0.311   NA    
 5     5 0.494   -0.177    0.450   NA    
 6     6 0.186   -0.623   -0.614   -0.786
 7     7 0.827    3.44     0.380    1.43 
 8     8 0.668   -0.192    0.354    0.387
 9     9 0.794    0.188    3.27     0.325
10    10 0.108   -0.864   -0.870   -0.781
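
One likely reason the across attempt in the question returns only t and y: across takes its functions through .fns, not .funs (the mutate_at spelling). A .funs argument falls into ... and across then has no functions to apply, so the selected column comes back unchanged. Below is a minimal sketch under that assumption, not a tested diagnosis of the original session. The helper make_growth and the names growth_fns and result are made up for illustration. Plain closures are used instead of formulas, and .names = "{.fn}" is set so the new columns are named growth_1, growth_3, growth_5 rather than y_growth_1 and so on.

```r
library(dplyr)
library(purrr)

set.seed(1)
df <- data.frame(t = 1:10, y = runif(10))
lags <- c(1, 3, 5)

# Hypothetical helper: returns a growth-rate function for a single lag i.
make_growth <- function(i) {
  force(i)  # evaluate i now so each closure keeps its own lag
  function(x) (x - dplyr::lag(x, n = i)) / dplyr::lag(x, n = i)
}

# Named list of functions; the names become the new column names.
growth_fns <- set_names(map(lags, make_growth), sprintf("growth_%d", lags))

result <- df %>%
  mutate(across(y, .fns = growth_fns, .names = "{.fn}"))

result
```

With the same seed as the question, this should give the same growth columns as the mutate_at version, without needing unnest or rename_with.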