How can I parameterize a mutate operation inside a custom function?

Time:10-21

I have several data.frames and I'd like to apply the same transformations to their columns.

My first attempt was something like this:

require(dplyr)

df_1 = data.frame(
  'a' = c('aa.aa', 'aa..a/a', 'aaa aa.'),
  'b' = c('b..b/', 'bbb./b', '..bb/--b'),
  'c' = c('ccc', 'cc/cc', 'ccc.-cc')
)
df_1

df_2 = data.frame(
  'a' = c('aa.a..a', '//aa..a/a', 'aaa aa.'),
  'b' = c('b../b/', 'bbb./b', '..bb/--b'),
  'c' = c('cc//c', 'cc/c/c', 'c//cc.-cc')
)
df_2

# df_3, df_4, df_5, ...

# remove '.', ' ', '/', '-'
# replace with '_'

df_new <- df_1 %>%
  mutate(a = toupper(a),
         a = gsub('\\.', '_', a),
         a = gsub('/', '_', a),
         a = gsub(' ', '_', a),
         a = gsub('-', '_', a))
df_new

Output:
> df_new
        a        b       c
1   AA_AA    b..b/     ccc
2 AA__A_A   bbb./b   cc/cc
3 AAA_AA_ ..bb/--b ccc.-cc

This uppercases column 'a' of df_1 and replaces every special character in it with '_'. I'd like to perform the same operations on other columns, so I was thinking of a function like this:

remove_special_characters <- function(df, var) {
  
  df_new <- df %>%
    mutate(var = gsub('\\.', '_', var),
           var = gsub('/', '_', var),
           var = gsub(' ', '_', var),
           var = gsub('-', '_', var))
  
  df_new
}

remove_special_characters(df_1, a)

Output:
Error: Problem with `mutate()` column `var`.
i `var = gsub("\\.", "_", var)`.
x object 'a' not found
Run `rlang::last_error()` to see where the error occurred.

remove_special_characters(df_2, b)
Output:
Error: Problem with `mutate()` column `var`.
i `var = gsub("\\.", "_", var)`.
x object 'b' not found
Run `rlang::last_error()` to see where the error occurred.

# ...

But this doesn't work. I looked into the reason and found that mutate() uses data masking. I searched for solutions, such as this one:

Use dynamic variable names in `dplyr`

But it does not solve my problem.

Is there any way to create a function that performs this operation?

CodePudding user response:

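You need to embrace the argument with {{ }} so that mutate() looks it up as a column of the data frame, and assign with := instead of =, because a plain = does not accept {{ }} on its left-hand side: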

remove_special_characters <- function(df, var) {
  
  df_new <- df %>%
    mutate({{var}} := toupper({{var}}),
           {{var}} := gsub('\\.', '_', {{var}}),
           {{var}} := gsub('/', '_', {{var}}),
           {{var}} := gsub(' ', '_', {{var}}),
           {{var}} := gsub('-', '_', {{var}}))
  
  df_new
}

Example:

> remove_special_characters(df_1, a)
        a        b       c
1   AA_AA    b..b/     ccc
2 AA__A_A   bbb./b   cc/cc
3 AAA_AA_ ..bb/--b ccc.-cc
> 
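If the column name arrives as a string rather than a bare name, a similar effect can be sketched with the .data pronoun and a glue-style name on the left of :=. The function name remove_special_characters_chr is hypothetical and not part of the answer above, and the four gsub() calls are collapsed into one character class, which is assumed to be equivalent for these inputs:

```r
library(dplyr)

# Hypothetical variant: `var` is a column name passed as a string.
remove_special_characters_chr <- function(df, var) {
  df %>%
    # "{var}" builds the output column name; .data[[var]] reads the column.
    # The class [./ -] matches '.', '/', ' ' and '-' (a trailing '-' is literal).
    mutate("{var}" := gsub("[./ -]", "_", toupper(.data[[var]])))
}

df_1 <- data.frame(
  a = c("aa.aa", "aa..a/a", "aaa aa."),
  b = c("b..b/", "bbb./b", "..bb/--b")
)

result_chr <- remove_special_characters_chr(df_1, "a")
result_chr
```

This form is convenient when the column names come from a character vector, for example when looping over names(df_1).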

CodePudding user response:

You can also use across() to apply a function to several columns at once.

First define a regular expression that matches the special characters to replace:

remove_vec <- paste(c("\\.", "/", " ", "-"), collapse = "|")

df_1 %>%
  mutate(across(.cols = c(a, b, c), .fns = ~ gsub(remove_vec, "_", .x)))
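Since the question asks for a reusable function, the across() approach can be wrapped in one. This is a minimal sketch: the name clean_columns and its cols argument are assumptions for illustration, and toupper() is included only because the question's first snippet uppercases the column.

```r
library(dplyr)

# Hypothetical helper: `cols` is any tidy-select expression, e.g. c(a, b).
clean_columns <- function(df, cols) {
  df %>%
    # {{ cols }} forwards the selection to across(); the lambda runs per column.
    mutate(across({{ cols }}, ~ gsub("[./ -]", "_", toupper(.x))))
}

df_1 <- data.frame(
  a = c("aa.aa", "aa..a/a", "aaa aa."),
  b = c("b..b/", "bbb./b", "..bb/--b"),
  c = c("ccc", "cc/cc", "ccc.-cc")
)

cleaned_ab  <- clean_columns(df_1, c(a, b))     # only a and b are changed
cleaned_all <- clean_columns(df_1, everything()) # every column is changed
cleaned_ab
```

Because cols is passed through tidy-select, the same function should work for df_2, df_3 and so on without naming each column in the function body.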