I want to create new variables from the existing ones that meet the following rules:
- Var doesn't start with "sym"
- Var doesn't end with "pct"
Each new var is the log of the original var, named by adding "_ln" to the original name. This is the dataset (my real dataset has 184 vars, which is why I want a function):
library(dplyr)
library(tidyr)
df <- data.frame(kg_chicken = c(1,2,3,4,5,6),
                 kg_chicken_pct = c(0.1,0.2,0.3,0.4,0.5,0.6),
                 sym_kg_chicken = c(-0.25,-0.15,-0.05,0.05,0.15,0.25))
df
  kg_chicken kg_chicken_pct sym_kg_chicken
1          1            0.1          -0.25
2          2            0.2          -0.15
3          3            0.3          -0.05
4          4            0.4           0.05
5          5            0.5           0.15
6          6            0.6           0.25
This is what I tried:
df_final <- df %>%
mutate_if(!starts_with("sym") & !ends_with("pct"),~ paste0(.,"_ln") = log(.))
But I get this error:
Error: unexpected '=' in:
"df_final <- df %>%
mutate_if(!starts_with("sym") & !ends_with("pct"),~ paste0(.,"_ln") ="
This is my expected result:
df_final
  kg_chicken kg_chicken_pct sym_kg_chicken kg_chicken_ln
1          1            0.1          -0.25         0.000
2          2            0.2          -0.15         0.693
3          3            0.3          -0.05         1.098
4          4            0.4           0.05         1.386
5          5            0.5           0.15         1.609
6          6            0.6           0.25         1.791
Any help will be greatly appreciated (even if it's in base R).
CodePudding user response:
mutate_if has been superseded by using across inside mutate, so the way to do this now would be:
df %>%
mutate(across(!starts_with("sym") & !ends_with("pct"), .fns = log, .names = "{.col}_ln"))
#>   kg_chicken kg_chicken_pct sym_kg_chicken kg_chicken_ln
#> 1          1            0.1          -0.25     0.0000000
#> 2          2            0.2          -0.15     0.6931472
#> 3          3            0.3          -0.05     1.0986123
#> 4          4            0.4           0.05     1.3862944
#> 5          5            0.5           0.15     1.6094379
#> 6          6            0.6           0.25     1.7917595
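Since the question also mentions base R, here is a minimal sketch of the same idea without dplyr. It rebuilds the example df from the question; the helper names keep and cols are made up for illustration, and it assumes every selected column is numeric and positive so that log() is meaningful.

```r
# Example data from the question
df <- data.frame(kg_chicken = c(1,2,3,4,5,6),
                 kg_chicken_pct = c(0.1,0.2,0.3,0.4,0.5,0.6),
                 sym_kg_chicken = c(-0.25,-0.15,-0.05,0.05,0.15,0.25))

# TRUE for columns whose names neither start with "sym" nor end with "pct"
# (keep and cols are illustrative names, not part of any API)
keep <- !startsWith(names(df), "sym") & !endsWith(names(df), "pct")
cols <- names(df)[keep]

# Add one "<name>_ln" column per selected column, holding its natural log
df_final <- df
df_final[paste0(cols, "_ln")] <- lapply(df[cols], log)

df_final
```

With 184 variables it may be worth printing cols before the assignment, to check that the name filter picked up exactly the columns intended.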