Best way to apply value labels to multiple columns (survey questions) and calculate statistics?

Time:03-15

I have a dataset with survey questionnaire responses on a 5-point scale (1 = strongly disagree; 5 = strongly agree).

df <- data.frame(q1 = c(1, 1, 2, 3),
                 q2 = c(2, 1, 4, 5))

Question 1: What is the best way to assign value labels to multiple columns to aid with interpretation, but allow me to calculate statistics (e.g., mean)? I tried mutating the column using the following and it works, but I'm unable to calculate the mean.

library(dplyr)

df <- data.frame(q1 = c(1, 1, 6, 3),
                 q2 = c(2, 1, 4, 5))
df <- df %>%
  mutate(q1 = ordered(q1,
                      levels = 1:6,
                      labels = c(
                        "Strongly disagree",
                        "Somewhat disagree",
                        "Neutral",
                        "Somewhat agree",
                        "Strongly agree",
                        "Don't know")))
mean(df$q1)

Warning message: In mean.default(df$q1) : argument is not numeric or logical: returning NA

Question 2: How do I apply value labels to multiple columns? I tried using this code, but it gave me an error message:

df <- df %>% 
  mutate(c(q1, q2) = ordered(c(q1,q2),
                       levels = 1:6,
                       labels = c(
                         "1-5 min",
                         "6-10 min",
                         "11-20 min",
                         "21-30 min",
                         "31+ min",
                         "Don't know/Not applicable")))
mean(df$q1)

Error: unexpected ')' in: " "Strongly agree", "Don't know"))"

CodePudding user response:

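You can use the labelled package, with across() to apply the labels to several columns at once (mutate() cannot take c(q1, q2) on the left of =). A labelled vector keeps the underlying numbers, so mean() still works.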

library(labelled)
library(dplyr)

df1 <- df %>%
  mutate(across(c(q1, q2), ~labelled(., labels = c(
    "Strongly disagree" = 1,
    "Somewhat disagree" = 2,
    "Neutral" = 3,
    "Somewhat agree" = 4,
    "Strongly agree" = 5,
    "Don't know" = 6))))

df1$q1

#<labelled<double>[4]>
#[1] 1 1 6 3

#Labels:
# value             label
#     1 Strongly disagree
#     2 Somewhat disagree
#     3           Neutral
#     4    Somewhat agree
#     5    Strongly agree
#     6        Don't know

mean(df1$q1)
#[1] 2.75
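
One caveat with the 2.75 above: it counts the 6 that stands for "Don't know" as if it were a score. If that code should be left out of the statistics (an assumption about the survey, not something stated in the question), a minimal base R sketch is to set it to NA before averaging. The name scores is only illustrative.

```r
df <- data.frame(q1 = c(1, 1, 6, 3),
                 q2 = c(2, 1, 4, 5))

# Assumption: code 6 ("Don't know") is not a point on the 1-5 scale
scores <- df
scores[scores == 6] <- NA

# Column means over the remaining answers
# q1 is 5/3 (about 1.67), q2 is 3
colMeans(scores, na.rm = TRUE)
```

This works on the plain numeric columns, so it can be done before or alongside the labelling step.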

CodePudding user response:

Define the labels once, in scale order:

levs <- c("Strongly disagree", "Somewhat disagree", "Neutral", "Somewhat agree", "Strongly agree", "Don't know")

Then there are a couple of options (both need library(dplyr)):

  1. Character strings (these sort alphabetically, not in scale order):

    df %>%
      mutate(across(everything(), ~ levs[.]))
    #                  q1                q2
    # 1 Strongly disagree Somewhat disagree
    # 2 Strongly disagree Strongly disagree
    # 3        Don't know    Somewhat agree
    # 4           Neutral    Strongly agree
    
  2. Factors (levels kept in scale order):

    df %>%
      mutate(across(everything(), ~ factor(levs[.], levels = levs)))
    #                  q1                q2
    # 1 Strongly disagree Somewhat disagree
    # 2 Strongly disagree Strongly disagree
    # 3        Don't know    Somewhat agree
    # 4           Neutral    Strongly agree
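
Neither option above is numeric, so mean() will not work on the result directly. Here is a small sketch of one way around that, written in base R so it runs without dplyr. Because levs is in code order, as.integer() on the factor gives back the original 1-6 codes. df_f is a name made up for this example.

```r
levs <- c("Strongly disagree", "Somewhat disagree", "Neutral",
          "Somewhat agree", "Strongly agree", "Don't know")

df <- data.frame(q1 = c(1, 1, 6, 3),
                 q2 = c(2, 1, 4, 5))

# Labelled copy of the data (df_f is a hypothetical name)
df_f <- df
df_f[] <- lapply(df, function(x) factor(levs[x], levels = levs))

# Labels for display and frequency tables
table(df_f$q1)

# Integer codes for statistics: level positions match the original 1-6 codes
mean(as.integer(df_f$q1))
# [1] 2.75
```

Note that this mean still includes the "Don't know" code, just like the labelled version.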
    