What is the best way to use lapply with factor() to parse survey data?


I'm trying to streamline my R code with functions and lapply() loops, and I've gotten stuck converting a block of code that turns survey data into a set of clustered Likert-style questions. Here's the original working code:


# Packages used below
library(forcats)  # fct_recode()
library(likert)   # likert()

# Example data, courtesy of dput()
q25_data <- structure(list(Q25_self_and_family = c(4, 2, 3, 3, 5, 3), Q25_local_area = c(3, 
3, 3, 3, 5, 3), Q25_uk = c(4, 3, 3, 3, 5, 2), Q25_outside_uk = c(4, 
4, 3, 3, 5, 4)), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame"))

# Set up levels text for question responses
q25_levels <- c("not at all serious", "somewhat serious", "moderately serious", "very serious", "extremely serious")
q25_data$Q25_self_and_family <- factor(q25_data$Q25_self_and_family, ordered = TRUE, levels = c("1", "2", "3", "4", "5"))
q25_data$Q25_local_area <- factor(q25_data$Q25_local_area, ordered = TRUE, levels = c("1", "2", "3", "4", "5"))
q25_data$Q25_uk <- factor(q25_data$Q25_uk, ordered = TRUE, levels = c("1", "2", "3", "4", "5"))
q25_data$Q25_outside_uk <- factor(q25_data$Q25_outside_uk, ordered = TRUE, levels = c("1", "2", "3", "4", "5"))
# Change factor names to match level text
q25_data$Q25_self_and_family <- fct_recode(q25_data$Q25_self_and_family, "not at all serious" = "1", "somewhat serious" = "2", "moderately serious" = "3", "very serious" = "4", "extremely serious" = "5")
q25_data$Q25_local_area <- fct_recode(q25_data$Q25_local_area, "not at all serious" = "1", "somewhat serious" = "2", "moderately serious" = "3", "very serious" = "4", "extremely serious" = "5")
q25_data$Q25_uk <- fct_recode(q25_data$Q25_uk, "not at all serious" = "1", "somewhat serious" = "2", "moderately serious" = "3", "very serious" = "4", "extremely serious" = "5")
q25_data$Q25_outside_uk <- fct_recode(q25_data$Q25_outside_uk, "not at all serious" = "1", "somewhat serious" = "2", "moderately serious" = "3", "very serious" = "4", "extremely serious" = "5")
# Change column names to the question text
names(q25_data) <- c("You and your family in the UK", "People in your local area or city", "The UK as a whole", "Your family and/or friends living outside the UK")
q25_likert_table <- likert(as.data.frame(q25_data))

So what I'm thinking is that I can take in the column names using q25_names <- names(select(q25_data, Q25_self_and_family:Q25_outside_uk)) and then use lapply() with a modified function like this: test <-lapply(q25_names, function(q25_column) {q25_data$q25_column <- factor(select(q25_data, q25_column), ordered = TRUE, levels = c("1", "2", "3", "4", "5"))}) however, I'm getting nowhere fast with this approach. Suspect I'm missing something obvious here, but I've been through a dozen examples on SE and still not finding a successful approach.

CodePudding user response:

You can "streamline" your workflow using pipes, as shown below. Briefly, you can use across() to apply a function to every column, and use factor(), whose labels argument conveniently lets you attach text labels to the factor levels. Then pipe the result through as.data.frame to get a plain data.frame, and on to likert() to get a likert object.


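library(dplyr)   # tibble(), mutate(), across(), %>%
library(likert)  # likert()

q25_data <- tibble(
    Q25_self_and_family = c(4, 2, 3, 3, 5, 3), 
    Q25_local_area = c(3, 3, 3, 3, 5, 3), 
    Q25_uk = c(4, 3, 3, 3, 5, 2), 
    Q25_outside_uk = c(4, 4, 3, 3, 5, 4))

# Set up levels text for question responses
q25_levels <- paste(c("not at all", "somewhat", "moderately", "very", "extremely"),
                    "serious")

# Turn every column into an ordered, labelled factor, then build the likert object
q25_likert_table <- q25_data %>% 
    mutate(across(everything(),
        ~ factor(.x, ordered = TRUE, levels = 1:5, labels = q25_levels))) %>% 
    as.data.frame %>% 
    likert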

Created on 2022-03-07 by the reprex package (v2.0.1)
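
If you would rather stay with lapply(), as the question asks, here is a minimal base R sketch. The attempt in the question likely fails for three reasons: q25_data$q25_column refers to a column literally named "q25_column", select() returns a data frame rather than a vector, and an assignment inside the anonymous function does not change q25_data outside it. Iterating over the columns themselves and assigning the returned list back avoids all three. For self-containment the data is rebuilt here as a plain data.frame, which is an assumption of this sketch and not part of the original post; the same assignment should work on a tibble too.

```r
# Rebuild the example data as a plain data.frame (assumption: base R only)
q25_data <- data.frame(
  Q25_self_and_family = c(4, 2, 3, 3, 5, 3),
  Q25_local_area      = c(3, 3, 3, 3, 5, 3),
  Q25_uk              = c(4, 3, 3, 3, 5, 2),
  Q25_outside_uk      = c(4, 4, 3, 3, 5, 4)
)

# Label text for the five response levels
q25_levels <- c("not at all serious", "somewhat serious", "moderately serious",
                "very serious", "extremely serious")

# Columns to convert
q25_names <- names(q25_data)

# lapply() over the columns (not their names) returns a list of factors;
# assigning that list back replaces the columns in place
q25_data[q25_names] <- lapply(q25_data[q25_names], function(column) {
  factor(column, levels = 1:5, labels = q25_levels, ordered = TRUE)
})

str(q25_data)
```

From here the remaining steps are unchanged: rename the columns to the question text with names(q25_data) <- ..., then pass the data frame to likert().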
