I'm working off this answer that describes the use of mutate_at with a list of functions applied to a column. I have modified the code in that answer and have a working example that seems to produce the quantities I am looking for (growth rates of a variable over different intervals):
library(tidyverse)
set.seed(1)
## data
df <- data.frame(t = 1:10, y = runif(10))
lags <- c(1, 3, 5)
df %>% mutate_at(vars(y), .funs = {
map(lags, function(i) ~ (.x - lag(.x, n = i)) / lag(.x, n = i)) %>%
setNames(sprintf("growth_%1i", lags))
})
# t y growth_1 growth_3 growth_5
# 1 1 0.26550866 NA NA NA
# 2 2 0.37212390 0.40155088 NA NA
# 3 3 0.57285336 0.53941567 NA NA
# 4 4 0.90820779 0.58541059 2.42063336 NA
# 5 5 0.20168193 -0.77793415 -0.45802478 NA
# 6 6 0.89838968 3.45448772 0.56827164 2.3836549
# 7 7 0.94467527 0.05152061 0.04015323 1.5386041
# 8 8 0.66079779 -0.30050271 2.27643527 0.1535200
# 9 9 0.62911404 -0.04794772 -0.29973145 -0.3073016
# 10 10 0.06178627 -0.90178844 -0.93459523 -0.6936450
However, since mutate_at has been superseded by the across syntax, and for consistency with the rest of my code, I have been trying to get a working version with the new syntax. I have code that runs but doesn't seem to produce the new columns, and I haven't been able to figure out why.
df %>% mutate(across(y, .funs = {
map(lags, function(i) ~ (.x - lag(.x, n = i)) / lag(.x, n = i)) %>%
setNames(sprintf("growth_%1i", lags))
}))
# t y
# 1 1 0.26550866
# 2 2 0.37212390
# 3 3 0.57285336
# 4 4 0.90820779
# 5 5 0.20168193
# 6 6 0.89838968
# 7 7 0.94467527
# 8 8 0.66079779
# 9 9 0.62911404
# 10 10 0.06178627
I had previously tried generating the list of functions outside the mutate call but couldn't get it to work. I thought the issue with the current code might be the placement of parentheses, braces, etc., but adjusting those hasn't resolved the problem. Any insights are appreciated.
Answer:
It is much easier to compute the new columns outside across and then bind them to the original data, instead of creating a list or tibble object inside across and then unnesting it:
library(purrr)
library(stringr)
library(dplyr)
map_dfc(lags, ~ df %>%
    transmute(!! str_c('growth_', .x) := (y - lag(y, n = .x)) / lag(y, n = .x))) %>%
  bind_cols(df, .)
-output
t y growth_1 growth_3 growth_5
1 1 0.8696908 NA NA NA
2 2 0.3403490 -0.6086552 NA NA
3 3 0.4820801 0.4164288 NA NA
4 4 0.5995658 0.2437058 -0.3105989 NA
5 5 0.4935413 -0.1768355 0.4501036 NA
6 6 0.1862176 -0.6226910 -0.6137206 -0.7858807
7 7 0.8273733 3.4430457 0.3799541 1.4309557
8 8 0.6684667 -0.1920615 0.3544292 0.3866300
9 9 0.7942399 0.1881517 3.2651170 0.3246917
10 10 0.1079436 -0.8640919 -0.8695346 -0.7812876
If we want to use across, the function can return a tibble of growth columns, which is then unnested and renamed:
library(tidyr)
df %>%
  mutate(across(y, function(.x)
    map_dfc(lags, function(i) (.x - lag(.x, i)) / lag(.x, i)),
    .names = "growth")) %>%
  unnest(growth, names_sep = "_") %>%
  rename_with(~ str_c('growth_', lags), starts_with('growth'))
-output
# A tibble: 10 × 5
t y growth_1 growth_3 growth_5
<int> <dbl> <dbl> <dbl> <dbl>
1 1 0.870 NA NA NA
2 2 0.340 -0.609 NA NA
3 3 0.482 0.416 NA NA
4 4 0.600 0.244 -0.311 NA
5 5 0.494 -0.177 0.450 NA
6 6 0.186 -0.623 -0.614 -0.786
7 7 0.827 3.44 0.380 1.43
8 8 0.668 -0.192 0.354 0.387
9 9 0.794 0.188 3.27 0.325
10 10 0.108 -0.864 -0.870 -0.781
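As an additional sketch that is not part of the answer above: across also accepts a named list of functions, so one option may be to build that list before the mutate call and let .names pick up the list names. The helper make_growth and the list growth_fns are names introduced here purely for illustration, and plain closures are used instead of the formula lambdas from the question. This assumes a reasonably recent dplyr and has not been checked against every version.

```r
library(dplyr)

set.seed(1)
df <- data.frame(t = 1:10, y = runif(10))
lags <- c(1, 3, 5)

# Hypothetical helper: returns a one-argument function that computes
# the growth rate of x over i periods.
make_growth <- function(i) {
  force(i)  # evaluate i now so each closure keeps its own lag
  function(x) (x - dplyr::lag(x, n = i)) / dplyr::lag(x, n = i)
}

# Named list of functions; the names become the new column names.
growth_fns <- setNames(lapply(lags, make_growth), paste0("growth_", lags))

res <- df %>%
  mutate(across(y, growth_fns, .names = "{.fn}"))

names(res)
```

With .names = "{.fn}" the new columns should take their names from the list alone (growth_1, growth_3, growth_5) rather than the default {.col}_{.fn} pattern, so no unnest or rename step is needed.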