I'm trying to convert Likert scale survey data (e.g., "Strongly Agree - 1") into numeric data for use in statistical analysis. I've got dozens of questions using the same scale.
I found a solution, but it seems clumsy, and I was hoping someone could suggest an improvement for the sake of learning.
df = df %>%
  mutate_all(funs(str_replace(.,"Very Dissatisfied1", "1"))) %>%
  mutate_all(funs(str_replace(.,"ModeratelyDissatisfied2", "2"))) %>%
  mutate_all(funs(str_replace(.,"SlightlyDissatisfied3", "3"))) %>%
  mutate_all(funs(str_replace(.,"Neither SatisfiedNor Dissatisfied4", "4"))) %>%
  mutate_all(funs(str_replace(.,"SlightlySatisfied5", "5"))) %>%
  mutate_all(funs(str_replace(.,"ModeratelySatisfied6", "6"))) %>%
  mutate_all(funs(str_replace(.,"VerySatisfied7", "7")))
I'm not sure what funs() is doing here, or to what extent mutate_all can take multiple arguments. How can this code be improved? Thanks for your help.
CodePudding user response:
funs() is deprecated and mutate_all() is superseded in recent dplyr versions.
Instead, we can use the newer implementations:
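library(dplyr)
library(stringr)
# Define a set of replacements
# The labels as they appear in the data ...
replacements <- c(
  "Very Dissatisfied1",
  "ModeratelyDissatisfied2",
  "SlightlyDissatisfied3",
  "Neither SatisfiedNor Dissatisfied4",
  "SlightlySatisfied5",
  "ModeratelySatisfied6",
  "VerySatisfied7"
) %>%
  # ... named by the code we want for each of them
  setNames(1:7)
# str_replace_all() reads a named vector as c(pattern = replacement),
# so flip it: the labels become the names and the codes the values
lookup <- setNames(names(replacements), replacements)
# Then, e.g., change them across all character columns
df %>%
  mutate(
    across(where(is.character), ~ str_replace_all(.x, lookup))
  )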
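As a self-contained illustration of the named-vector form of str_replace_all(), here is a minimal sketch on a made-up data frame. The names toy, lookup and toy_num are hypothetical and not part of the question's data, and the final as.numeric() step assumes every character column holds only these Likert labels.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data: two Likert items using labels from the question
toy <- tibble(
  q1 = c("Very Dissatisfied1", "VerySatisfied7", "SlightlySatisfied5"),
  q2 = c("ModeratelyDissatisfied2", "Neither SatisfiedNor Dissatisfied4",
         "ModeratelySatisfied6")
)

# Names are the patterns to find, values are what to put in their place
lookup <- c(
  "Very Dissatisfied1" = "1",
  "ModeratelyDissatisfied2" = "2",
  "SlightlyDissatisfied3" = "3",
  "Neither SatisfiedNor Dissatisfied4" = "4",
  "SlightlySatisfied5" = "5",
  "ModeratelySatisfied6" = "6",
  "VerySatisfied7" = "7"
)

# Replace the labels, then convert the resulting strings to numbers
toy_num <- toy %>%
  mutate(across(where(is.character), ~ as.numeric(str_replace_all(.x, lookup))))
```

Here toy_num$q1 should come out as 1, 7, 5 and toy_num$q2 as 2, 4, 6. If some character columns are not Likert items, it is probably safer to pass an explicit column selection to across() instead of where(is.character).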
CodePudding user response:
Note that if the pattern is consistent, i.e. the final digit of each label is the numeric code we want,
then we can extract it directly:
data.frame(replacements,
           code = as.numeric(sub(".*(\\d+$)", "\\1", replacements)))
                        replacements code
1                 Very Dissatisfied1    1
2            ModeratelyDissatisfied2    2
3              SlightlyDissatisfied3    3
4 Neither SatisfiedNor Dissatisfied4    4
5                 SlightlySatisfied5    5
6               ModeratelySatisfied6    6
7                     VerySatisfied7    7
Or even shorter:
data.frame(replacements,
           code = as.numeric(sub("\\D+", "", replacements)))
The replacements vector comes from @Baraliuh's answer.
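The same digit extraction can be applied to the survey columns directly, skipping the lookup table. This is only a sketch: toy and toy_num are hypothetical names, and it assumes every column is a Likert item whose label ends in its code.

```r
library(dplyr)

# Hypothetical toy data using labels from the question
toy <- data.frame(
  q1 = c("Very Dissatisfied1", "VerySatisfied7"),
  q2 = c("SlightlySatisfied5", "Neither SatisfiedNor Dissatisfied4")
)

# Drop the leading run of non-digits, then convert what is left to a number
toy_num <- toy %>%
  mutate(across(everything(), ~ as.numeric(sub("\\D+", "", .x))))
```

With this toy data, q1 should become 1, 7 and q2 should become 5, 4. A value that contains no digit at all would end up as NA, so it is worth checking for unexpected NAs afterwards.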