I am having trouble creating dynamic variable names in a for loop. I have consulted previous Stack Overflow posts on this topic, whose code I am replicating, but it is not working in my case. I am recoding survey responses to account for skip logic, and I am trying to use the following code to recode the variables efficiently rather than one by one. Let me know if you have any suggestions.
# Example data:
var0 = c(1, 2, 2, 1, 1, 2, 2)
var1 = c(NA, 1, 0, 1, 0, NA, 4444)
var2 = c(1, NA, 0, 0, 1, 4444, NA)
var3 = c(NA, 1, 0, 4444, 1, NA, 1)
df1 <- data.frame(var0, var1, var2, var3)
# Data:
var0 var1 var2 var3
1 1 NA 1 NA
2 2 1 NA 1
3 2 0 0 0
4 1 1 0 4444
5 1 0 1 1
6 2 NA 4444 NA
7 2 4444 NA 1
This is my function and for loop:
vars = c("var1", "var2")
func <- function(i) {
mutate(df1, !!i := case_when(!is.na(i) ~ i,
is.na(i) & var0 != '1' ~ '4444',
TRUE ~ '0'))
}
for(i in vars) {
df2 <- func(i)
}
test <- df2 %>%
select(var1, var3) #leaving var3 unchanged to test in comparison
This is what I would like the result to look like:
var0 var1 var2 var3
1 1 0 1 NA
2 2 1 4444 1
3 2 0 0 0
4 1 1 0 4444
5 1 0 1 1
6 2 4444 4444 NA
7 2 4444 4444 1
CodePudding user response:
Since we are passing a string, convert it to a symbol and evaluate it (!!).
library(dplyr)
func <- function(i) {
  # i is a column name held as a string; sym() turns it into a symbol
  col <- rlang::sym(i)
  mutate(df1, !!i := case_when(!is.na(!!col) ~ as.character(!!col),
                               is.na(!!col) & var0 != 1 ~ '4444',
                               TRUE ~ '0'))
}
Testing:
for(i in vars) {
df1 <- func(i)
}
df1
var0 var1 var2 var3
1 1 0 1 NA
2 2 1 4444 1
3 2 0 0 0
4 1 1 0 4444
5 1 0 1 1
6 2 4444 4444 NA
7 2 4444 4444 1
We may do this with across as well:
df1 %>%
  mutate(across(all_of(vars),
                ~ case_when(!is.na(.) ~ as.character(.),
                            is.na(.) & var0 != '1' ~ '4444',
                            TRUE ~ '0')))
var0 var1 var2 var3
1 1 0 1 NA
2 2 1 4444 1
3 2 0 0 0
4 1 1 0 4444
5 1 0 1 1
6 2 4444 4444 NA
7 2 4444 4444 1
CodePudding user response:
You may use .data when passing a column name as a string. Also, update the original variable (df1) instead of creating a new one (df2), because the function always refers to the original variable (df1).
library(dplyr)
func <- function(i) {
  mutate(df1, !!i := case_when(!is.na(.data[[i]]) ~ .data[[i]],
                               is.na(.data[[i]]) & var0 != 1 ~ 4444,
                               TRUE ~ 0))
}
vars = c("var1", "var2")
for(i in vars) {
df1 <- func(i)
}
df1
# var0 var1 var2 var3
#1 1 0 1 NA
#2 2 1 4444 1
#3 2 0 0 0
#4 1 1 0 4444
#5 1 0 1 1
#6 2 4444 4444 NA
#7 2 4444 4444 1
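If you would rather keep df1 untouched, one option is to pass the data frame into the function, so the loop builds on its own previous result instead of always starting from df1. This is only a sketch: recode_skip is a hypothetical helper name (not from the code above), and it assumes the same example data and recoding rules as in the question.

```r
library(dplyr)

# Same example data as in the question.
df1 <- data.frame(
  var0 = c(1, 2, 2, 1, 1, 2, 2),
  var1 = c(NA, 1, 0, 1, 0, NA, 4444),
  var2 = c(1, NA, 0, 0, 1, 4444, NA),
  var3 = c(NA, 1, 0, 4444, 1, NA, 1)
)
vars <- c("var1", "var2")

# Hypothetical helper: takes the data frame as an argument instead of
# reading df1 from the enclosing environment.
recode_skip <- function(data, col) {
  mutate(data, !!col := case_when(
    !is.na(.data[[col]])               ~ .data[[col]],  # keep answered values
    is.na(.data[[col]]) & var0 != 1    ~ 4444,          # missing and var0 is not 1
    TRUE                               ~ 0              # missing and var0 is 1
  ))
}

# Start from a copy and feed each result back into the next iteration.
df2 <- df1
for (col in vars) {
  df2 <- recode_skip(df2, col)
}
df2
```

Here df2 accumulates the recoded var1 and var2, while df1 and var3 stay as they were.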