Need to create multiple new variables simultaneously using across() in R

Time:04-10

I have a dataset in which I need to adjust multiple variables for inflation. It looks something like this:

year price1 price2 price3 price4
2003 1.149 1.149 1.163 1.172
2004 1.169 1.164 1.184 1.18
2005 1.167 1.166 1.183 1.178

I need to put these all in constant dollars (for example, 2020 dollars). I can do this pretty easily with the adjust_for_inflation function from the priceR package. However, there are a lot of price variables, so I'd like to create them all automatically. I've been trying to do this with across, but it isn't working. Here's what I've been trying.

library(tidyverse)
library(priceR)

#this is it done manually, which would take hours

df %>%  mutate(adjusted_price1=adjust_for_inflation(price1,year,"US",to_date = 2020))

#here's my attempt to do it all at once
price.vars <- df %>% select(-year) %>% names()

dollars2020 <- function(x){
  
  y <- adjust_for_inflation(x,year,"US",to_date = 2020)
}

df <- df %>% 
  mutate(across(price.vars, dollars2020,.names ="adjusted_{col}"))

As far as I can tell, this should be spitting out a list of new variables with names like adjusted_price1 and so forth. But it's not working. I'd really appreciate any help anyone could give.

CodePudding user response:

Maybe this works for you. Instead of a custom function, I passed adjust_for_inflation directly into your dplyr line:

Code

library(dplyr)
library(priceR)

price.vars <- df  %>% select(-year) %>% names()

# all_of() selects the columns named in the character vector price.vars
df %>%
  mutate(across(all_of(price.vars),
                ~ adjust_for_inflation(.x, year, "US", to_date = 2020),
                .names = "adjusted_{col}"))

Output

# A tibble: 3 x 9
   year price1 price2 price3 price4 adjusted_price1 adjusted_price2 adjusted_price3 adjusted_price4
  <int>  <dbl>  <dbl>  <dbl>  <dbl>           <dbl>           <dbl>           <dbl>           <dbl>
1  2003   1.15   1.15   1.16   1.17            1.62            1.62            1.64            1.65
2  2004   1.17   1.16   1.18   1.18            1.60            1.59            1.62            1.62
3  2005   1.17   1.17   1.18   1.18            1.55            1.55            1.57            1.56

Data

library(data.table)  # for fread()

df <- tibble(fread("year    price1  price2  price3  price4
2003    1.149   1.149   1.163   1.172
2004    1.169   1.164   1.184   1.18
2005    1.167   1.166   1.183   1.178"))
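If data.table is not installed, the same example data can probably be built with tribble() from the tibble package instead. This is only an alternative way to construct the data above; the values are copied from the question, and year ends up as a double rather than an integer.

```r
library(tibble)

# Same values as in the question, typed row by row; no fread() needed.
df <- tribble(
  ~year, ~price1, ~price2, ~price3, ~price4,
  2003,  1.149,   1.149,   1.163,   1.172,
  2004,  1.169,   1.164,   1.184,   1.18,
  2005,  1.167,   1.166,   1.183,   1.178
)
```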

CodePudding user response:

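The problem is not with your use of across; it's your function. Firstly, you are passing a variable called year to adjust_for_inflation, but no such variable exists inside the function: dollars2020 is defined outside mutate, so it cannot see the year column. Secondly, your function ends with an assignment, so it returns its result only invisibly; it is cleaner to return the call directly. If you change it to: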

dollars2020 <- function(x){
  
  adjust_for_inflation(x, 2022,"US",to_date = 2020)
}

You'll get:

df %>% 
  mutate(across(price.vars, dollars2020,.names ="adjusted_{col}"))
#>   year price1 price2 price3 price4 adjusted_price1 adjusted_price2 adjusted_price3 adjusted_price4
#> 1 2003  1.149  1.149  1.163  1.172        1.134999        1.134999        1.148828        1.157719
#> 2 2004  1.169  1.164  1.184  1.180        1.154755        1.149816        1.169572        1.165621
#> 3 2005  1.167  1.166  1.183  1.178        1.152779        1.151792        1.168585        1.163645
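Note that hard-coding 2022 treats every price as if it were in 2022 dollars. If each row should instead be adjusted from its own year, one option is to give the wrapper an explicit year argument and pass the column in from across(). The sketch below uses a hypothetical stand-in, toy_adjust(), with a made-up flat 2% annual rate so that it runs without priceR or a network connection. With priceR, the body of the wrapper would presumably be adjust_for_inflation(x, year, "US", to_date = 2020), as in the first answer.

```r
library(dplyr)

# Hypothetical stand-in for priceR::adjust_for_inflation(): compounds a
# made-up flat 2% annual rate. This is not real inflation data.
toy_adjust <- function(price, from_year, to_year = 2020, rate = 0.02) {
  price * (1 + rate)^(to_year - from_year)
}

# The wrapper takes year as an argument instead of looking it up itself,
# because a function defined outside mutate() cannot see df's columns.
dollars2020 <- function(x, year) {
  toy_adjust(x, year, to_year = 2020)
}

# A two-column subset of the question's data, to keep the example short.
df <- data.frame(
  year   = c(2003, 2004, 2005),
  price1 = c(1.149, 1.169, 1.167),
  price2 = c(1.149, 1.164, 1.166)
)

price.vars <- df %>% select(-year) %>% names()

# Inside the lambda, `year` refers to the year column of df.
adjusted <- df %>%
  mutate(across(all_of(price.vars), ~ dollars2020(.x, year),
                .names = "adjusted_{.col}"))
```

Each adjusted column is then computed row by row from that row's own year, rather than from a single fixed year.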

CodePudding user response:

Is it just the naming part that doesn't work? If so, change {col} to {.col}.
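For reference, the documented placeholders in .names are {.col} (the selected column's name) and {.fn} (the function's name). The small sketch below uses a made-up transformation (doubling) just to show the resulting names. Since the outputs in the answers above already show adjusted_price1 and so on, the naming is probably not the cause here.

```r
library(dplyr)

df <- data.frame(
  year   = c(2003, 2004),
  price1 = c(1.149, 1.169),
  price2 = c(1.149, 1.164)
)

# {.col} is replaced by each selected column's name.
renamed <- df %>%
  mutate(across(starts_with("price"), ~ .x * 2, .names = "adjusted_{.col}"))

names(renamed)
```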
