I have a dataset with survey questionnaire responses on a 5-point scale (1 = strongly disagree; 5 = strongly agree).
df <- data.frame(q1 = c(1, 1, 2, 3),
q2 = c(2, 1, 4, 5))
Question 1: What is the best way to assign value labels to multiple columns to aid with interpretation, but allow me to calculate statistics (e.g., mean)? I tried mutating the column using the following and it works, but I'm unable to calculate the mean.
library(dplyr)

df <- data.frame(q1 = c(1, 1, 6, 3),
                 q2 = c(2, 1, 4, 5))
df <- df %>%
  mutate(q1 = ordered(q1,
                      levels = 1:6,
                      labels = c(
                        "Strongly disagree",
                        "Somewhat disagree",
                        "Neutral",
                        "Somewhat agree",
                        "Strongly agree",
                        "Don't know")))
mean(df$q1)
Warning message: In mean.default(df$q1) : argument is not numeric or logical: returning NA
Question 2: How do I apply value labels to multiple columns? I tried using this code, but it gave me an error message:
df <- df %>%
mutate(c(q1, q2) = ordered(c(q1,q2),
levels = 1:6,
labels = c(
"1-5 min",
"6-10 min",
"11-20 min",
"21-30 min",
"31 min",
"Don't know/Not applicable")))
mean(df$q1)
Error: unexpected ')' in: " "Strongly agree", "Don't know"))"
CodePudding user response:
You can use the labelled package, together with across to apply the function to multiple columns. The values stay numeric, so mean still works.
library(labelled)
library(dplyr)
df1 <- df %>%
  mutate(across(c(q1, q2), ~ labelled(., labels = c(
    "Strongly disagree" = 1,
    "Somewhat disagree" = 2,
    "Neutral" = 3,
    "Somewhat agree" = 4,
    "Strongly agree" = 5,
    "Don't know" = 6))))
df1$q1
#<labelled<double>[4]>
#[1] 1 1 6 3
#Labels:
# value label
# 1 Strongly disagree
# 2 Somewhat disagree
# 3 Neutral
# 4 Somewhat agree
# 5 Strongly agree
# 6 Don't know
mean(df1$q1)
#[1] 2.75
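Note that the 2.75 above includes the 6 coded as "Don't know", which pulls the mean upward. If that code should not count toward the average, a minimal sketch could look like the following. It assumes 6 is the only non-substantive code; likert_labels and mean_without_dk are names made up for this example, not part of labelled or dplyr.

```r
library(dplyr)
library(labelled)

# Same label set as above, stored once so it can be reused.
likert_labels <- c(
  "Strongly disagree" = 1,
  "Somewhat disagree" = 2,
  "Neutral" = 3,
  "Somewhat agree" = 4,
  "Strongly agree" = 5,
  "Don't know" = 6)

df <- data.frame(q1 = c(1, 1, 6, 3),
                 q2 = c(2, 1, 4, 5))

df1 <- df %>%
  mutate(across(c(q1, q2), ~ labelled(.x, labels = likert_labels)))

# Hypothetical helper: average the codes, skipping the "Don't know" code.
mean_without_dk <- function(x, dk_code = 6) {
  values <- as.numeric(x)          # plain numeric codes, labels dropped
  mean(values[values != dk_code])
}

# q1 is averaged over 1, 1, 3; q2 over 2, 1, 4, 5.
means <- df1 %>%
  summarise(across(c(q1, q2), mean_without_dk))

# For display, to_factor() from labelled turns the codes into their labels.
q1_as_factor <- to_factor(df1$q1)
```

Whether "Don't know" should be dropped, recoded to NA, or kept is a judgment call about the survey, so treat the dk_code default as a placeholder.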
CodePudding user response:
Using the labels as a character vector:
levs <- c("Strongly disagree", "Somewhat disagree", "Neutral", "Somewhat agree", "Strongly agree", "Don't know")
Some options:
Strings (these carry no level order, so they sort alphabetically rather than in scale order):
df %>% mutate(across(everything(), ~ levs[.]))
#                  q1                q2
# 1 Strongly disagree Somewhat disagree
# 2 Strongly disagree Strongly disagree
# 3        Don't know    Somewhat agree
# 4           Neutral    Strongly agree
Factors (levels kept in scale order):
df %>% mutate(across(everything(), ~ factor(levs[.], levels = levs)))
#                  q1                q2
# 1 Strongly disagree Somewhat disagree
# 2 Strongly disagree Strongly disagree
# 3        Don't know    Somewhat agree
# 4           Neutral    Strongly agree
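Both options above replace the numeric codes, so mean() no longer works on those columns. One way around that, sketched here under the assumption that extra columns are acceptable, is to keep the numeric columns and add labelled copies next to them. The _label suffix is just a naming choice for this example.

```r
library(dplyr)

levs <- c("Strongly disagree", "Somewhat disagree", "Neutral",
          "Somewhat agree", "Strongly agree", "Don't know")

df <- data.frame(q1 = c(1, 1, 6, 3),
                 q2 = c(2, 1, 4, 5))

# Keep q1/q2 numeric and add q1_label/q2_label as factors.
df2 <- df %>%
  mutate(across(c(q1, q2),
                ~ factor(levs[.x], levels = levs),
                .names = "{.col}_label"))

# The numeric column still works with mean().
q1_mean <- mean(df2$q1)

# The factor's integer codes map back to the original 1-6 values,
# because the levels are listed in scale order.
q1_codes <- as.integer(df2$q1_label)
```

Use the numeric columns for statistics and the _label columns for tables and plots. The mean here still counts the 6 ("Don't know") unless it is filtered out first.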