I've got a batch of survey data where respondents were able to tick multiple responses in response to a series of related questions. The end goal is to get this into a grouped bar chart with each category (e.g. "Q13_heatwave_sleep") as a group and counts of each unique response within that group as the bars (e.g. 1, 2, 3, 4, 5). I believe I need to use separate_rows()
to get each of the comma delimited values into its own row, and then use pivot_longer()
to get this into long format for ggplot. I'm struggling to get separate_rows()
to go across each column and then work with pivot_longer. Here's a reproducible example:
require(tidyverse)
# Ingest some data
q13_data <- structure(list(Q13_heatwave_sleep = c("1", "4,2", "5", "5,4,3,2",
"5,4", "5", "5", "2", "5,4", "5"), Q13_heatwave_travel = c("1",
"4,2", "4", "5,4,3,2", "1", "2", "1", "2", "5", "5,4"), Q13_heatwave_health = c("1",
"4,2", "2", "5,4,3,2", "2", "3", "1", "2", "5", "3"), Q13_flooded_home = c("1",
"5,3", "2", "3", "4,2", "5", "4", "2", "3", "1"), Q13_flooding_local = c("1",
"3,2", "2", "3", "1", "5", "1", "2", "3", "5"), Q13_flood_risk_relocation = c("1",
"2,1", "2", "1", "2", "1", "4", "1", "3", "1"), Q13_snow_damage = c("1",
"4,3", "2", "1", "1", "3", "1", "1", "1", "4,3"), Q13_snow_disruption = c("1",
"5,4", "5", "5,3", "3", "3", "1", "1", "1", "3"), Q13_water_shortages = c("1",
"4,2", "5", "5,4,3,2", "2", "3", "4", "2", "5", "4"), Q13_food_restrictions = c("1",
"4,3", "4", "1", "2", "1", "4", "1", "1", "5"), Q13_wildfires = c("1",
"5,4", "1", "1", "2", "2", "4", "1", "1", "1")), class = c("rowwise_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -10L), groups = structure(list(
.rows = structure(list(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L,
10L), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = c(NA, -10L), class = c("tbl_df",
"tbl", "data.frame")))
q13_data <- separate_rows(q13_data, Q13_heatwave_travel) # totally at a loss here as to how to get this to work across all columns
q13_data %>%
pivot_longer(Q13_heatwave_sleep:Q13_wildfires, names_to = "question", values_to = "resp")
How can I get this dataframe all stretched out and ready to plot?
CodePudding user response:
As Gregor says, pivot first then separate the rows:
q13_data %>%
pivot_longer(cols = everything()) %>%
separate_rows(value, sep = ",")