Home > front end >  R create multiple columns in one mutate command
R create multiple columns in one mutate command

Time:12-14

I have a data.frame like this.

library(tidyverse)

df <- tibble(
  name = rep(c("a", "b"), each = 100),
  value = runif(100*2),
  date = rep(Sys.Date()   days(1:100), 2)
  )

I would like to do something very similar to the code below. Is there a way to create these 10 columns in one go? Basically, I am trying to find out how much does 99th percent quantile change if we remove one observation, and then 2, and then 3 and so on.

df %>%
  nest_by(name) %>%
  mutate(
    q99_lag_0 = data %>% pull(value) %>% quantile(.99),
    q99_lag_1 = data %>% pull(value) %>% tail(-1) %>% quantile(.99),
    q99_lag_2 = data %>% pull(value) %>% tail(-2) %>% quantile(.99),
    q99_lag_3 = data %>% pull(value) %>% tail(-3) %>% quantile(.99),
    q99_lag_4 = data %>% pull(value) %>% tail(-4) %>% quantile(.99),
    q99_lag_5 = data %>% pull(value) %>% tail(-5) %>% quantile(.99),
    q99_lag_6 = data %>% pull(value) %>% tail(-6) %>% quantile(.99),
    q99_lag_7 = data %>% pull(value) %>% tail(-7) %>% quantile(.99),
    q99_lag_8 = data %>% pull(value) %>% tail(-8) %>% quantile(.99),
    q99_lag_9 = data %>% pull(value) %>% tail(-9) %>% quantile(.99),
    q99_lag_10 = data %>% pull(value) %>% tail(-10) %>% quantile(.99)
  )

CodePudding user response:

First, reproducible random data:

library(dplyr)
library(purrr) # map_dfx
set.seed(42)
df <- tibble(
  name = rep(c("a", "b"), each = 100),
  value = runif(100*2),
  date = rep(Sys.Date()   1:100, 2)
)
head(df)
# # A tibble: 6 x 3
#   name  value date      
#   <chr> <dbl> <date>    
# 1 a     0.915 2021-12-14
# 2 a     0.937 2021-12-15
# 3 a     0.286 2021-12-16
# 4 a     0.830 2021-12-17
# 5 a     0.642 2021-12-18
# 6 a     0.519 2021-12-19

Then the call:

df %>%
  nest_by(name) %>%
  mutate(
    q99_lag_0 = quantile(data$value, 0.99),
    map_dfc(-1:-10, ~ tibble("q99_lag_{-.x}" := quantile(tail(data$value, .x), 0.99)))
  ) %>%
  ungroup()
# # A tibble: 2 x 13
#   name                data q99_lag_0 q99_lag_1 q99_lag_2 q99_lag_3 q99_lag_4 q99_lag_5 q99_lag_6 q99_lag_7 q99_lag_8 q99_lag_9 q99_lag_10
#   <chr> <list<tbl_df[,2]>>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>      <dbl>
# 1 a              [100 x 2]     0.983     0.983     0.983     0.983     0.983     0.983     0.983     0.983     0.983     0.983      0.983
# 2 b              [100 x 2]     0.963     0.963     0.963     0.963     0.963     0.963     0.946     0.946     0.946     0.947      0.947
  • Related