Create new column using tidy evaluation on the left and right of mutate in R

Time:02-11

I know there are many questions regarding tidy evaluation in R. However, I couldn't figure out a solution to this seemingly easy problem.

I have this data.frame

structure(list(Date = c("25.02.2020", "26.02.2020", "27.02.2020", 
"28.02.2020", "02.03.2020", "03.03.2020", "04.03.2020", "05.03.2020", 
"06.03.2020", "09.03.2020", "10.03.2020", "11.03.2020", "12.03.2020", 
"13.03.2020", "16.03.2020", "17.03.2020", "18.03.2020", "19.03.2020", 
"20.03.2020", "23.03.2020", "24.03.2020", "25.03.2020", "26.03.2020", 
"27.03.2020", "30.03.2020", "31.03.2020", "01.04.2020", "02.04.2020", 
"03.04.2020", "06.04.2020"), change_AAPL_stocks = c("1", "0,95", 
"0,93", "0,85", "0,94", "1,01", "0,99", "0,98", "0,94", "0,88", 
"0,92", "0,92", "0,85", "0,88", "0,8", "0,82", "0,8", "0,82", 
"0,82", "0,76", "0,79", "0,83", "0,82", "0,84", "0,83", "0,85", 
"0,82", "0,8", "0,81", "0,83"), change_AMZN_stocks = c("1", "0,97", 
"0,95", "0,9", "0,94", "0,97", "0,96", "0,95", "0,93", "0,88", 
"0,92", "0,92", "0,85", "0,87", "0,81", "0,88", "0,86", "0,92", 
"0,95", "0,9", "0,96", "0,95", "0,94", "0,95", "0,95", "0,97", 
"0,95", "0,94", "0,94", "0,96")), row.names = c(NA, -30L), class = c("tbl_df", 
"tbl", "data.frame"))

And I have these variables

date_col = "Date"
date_format = "%d.%m.%Y"
value_col = "change_AAPL_stocks"

And I'd like to write a function that can take arbitrary date_col and date_format values.

The code at the moment looks like this:

  df %>% 
    select(date_col, value_col) %>% 
    mutate(
      {{date_col}} := as.Date({date_col}, format=date_format)
    )

This creates (overwrites) the column named Date. However, the as.Date(...) call does not work, and I am not sure how to fix it.

CodePudding user response:

We can use the .data pronoun to subset the column by its name

library(dplyr)
df %>% 
    select(all_of(date_col)) %>% 
    mutate(!! date_col := as.Date(.data[[date_col]], format = date_format))

-output

# A tibble: 30 × 1
   Date      
   <date>    
 1 2020-02-25
 2 2020-02-26
 3 2020-02-27
 4 2020-02-28
 5 2020-03-02
 6 2020-03-03
 7 2020-03-04
 8 2020-03-05
 9 2020-03-06
10 2020-03-09
# … with 20 more rows
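
Since the question asks for a function that takes arbitrary date_col and date_format values, the same .data approach can be wrapped up as below. This is only a minimal sketch: the helper name parse_date_col and the two-row sample_df (the first two rows of the question's data) are made up for illustration, and both arguments are assumed to be plain strings.

```r
library(dplyr)

# Hypothetical helper (name chosen for illustration, not from the question):
# converts the column named by `date_col` to Date using `date_format`.
parse_date_col <- function(data, date_col, date_format) {
  data %>%
    # !! on the left injects the string as the new column's name;
    # .data[[...]] on the right looks the column up by that same string.
    mutate(!!date_col := as.Date(.data[[date_col]], format = date_format))
}

# First two rows of the question's data, retyped as a small sample
sample_df <- tibble(
  Date = c("25.02.2020", "26.02.2020"),
  change_AAPL_stocks = c("1", "0,95")
)

result <- parse_date_col(sample_df, "Date", "%d.%m.%Y")
```

Unlike the pipeline above, this sketch keeps the other columns; add select(all_of(date_col)) inside the helper if only the date column is wanted.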

CodePudding user response:

Use sym to turn the string into a symbol and unquote it with !! on the rhs.

df %>% 
    select(all_of(date_col)) %>% 
    mutate(
        {{date_col}} := as.Date(!!sym(date_col), format=date_format)
    )
# A tibble: 30 x 1
   Date      
   <date>    
 1 2020-02-25
 2 2020-02-26
 3 2020-02-27
 4 2020-02-28
 5 2020-03-02
 6 2020-03-03
 7 2020-03-04
 8 2020-03-05
 9 2020-03-06
10 2020-03-09
# ... with 20 more rows

You can generalize the function to take symbol or character as input as follows:

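f <- function(data, date_col, date_format = "%d.%m.%Y"){
    # Capture the argument unevaluated so a bare column name does not error
    arg <- rlang::enexpr(date_col)
    if(rlang::is_symbol(arg) && rlang::as_string(arg) %in% names(data)){
        # Bare column name, e.g. f(Date)
        col_name <- rlang::as_string(arg)
    }else{
        # A string, or a variable holding a string, e.g. f('Date') or f(date_col)
        col_name <- date_col
    }
    data %>% 
        select(all_of(col_name)) %>% 
        mutate(
            # name the column on the lhs, refer to it as a symbol on the rhs
            !!col_name := as.Date(!!sym(col_name), format=date_format)
        )
}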
df %>% 
    f('Date')
# A tibble: 30 x 1
   Date      
   <date>    
 1 2020-02-25
 2 2020-02-26
 3 2020-02-27
 4 2020-02-28
 5 2020-03-02
 6 2020-03-03
 7 2020-03-04
 8 2020-03-05
 9 2020-03-06
10 2020-03-09
# ... with 20 more rows
df %>% 
    f(date_col)
# A tibble: 30 x 1
   Date      
   <date>    
 1 2020-02-25
 2 2020-02-26
 3 2020-02-27
 4 2020-02-28
 5 2020-03-02
 6 2020-03-03
 7 2020-03-04
 8 2020-03-05
 9 2020-03-06
10 2020-03-09
# ... with 20 more rows
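
If the column name is always available as a string, another option is to skip := altogether and let across() with all_of() handle both the selection and the naming. The following is a rough sketch rather than a drop-in replacement; sample_df is a made-up two-row excerpt of the question's data, and it assumes dplyr 1.0.0 or later, where across() is available.

```r
library(dplyr)

# First two rows of the question's data, retyped as a small sample
sample_df <- tibble(
  Date = c("25.02.2020", "26.02.2020"),
  change_AAPL_stocks = c("1", "0,95")
)

date_col <- "Date"
date_format <- "%d.%m.%Y"

converted <- sample_df %>%
  # all_of() selects the column named by the string;
  # across() applies the conversion and keeps the original column name.
  mutate(across(all_of(date_col), ~ as.Date(.x, format = date_format)))
```

Because across() writes the result back under the same name, nothing has to be unquoted on either side of the assignment.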