This might be trivial but I can't figure it out and can't find it online. Let's say I survey people asking the reason they did something. Two options: reason 1 and reason 2. They can also pick both options.
data <- data.frame('reason'=c(rep('R1', 5),rep('R2', 3),rep('R1,R2', 4)))
data
reason
1 R1
2 R1
3 R1
4 R1
5 R1
6 R2
7 R2
8 R2
9 R1,R2
10 R1,R2
11 R1,R2
12 R1,R2
I want to plot the answers, but counting only R1 and R2. That is, if someone answered both R1 and R2, assign one count to each. The command,
ggplot(data = data, aes(x = reason)) + geom_bar() + coord_flip()
would plot the multiple answer cases as a separate category.
What I want instead is for R1 to have a count of 5 + 4 = 9 and R2 to have a count of 3 + 4 = 7, with no R1,R2 category.
I am interested in this because I have real data from a Qualtrics survey.
CodePudding user response:
Using tidyr::separate_rows, you could split your reason column into multiple rows:
data <- data.frame('reason'=c(rep('R1', 5),rep('R2', 3),rep('R1,R2', 4)))
library(tidyr)
library(ggplot2)
# The default separator splits on non-alphanumeric characters, so "R1,R2" becomes two rows
data_sep <- data |>
  separate_rows(reason)
ggplot(data = data_sep, aes(y = reason)) +
  geom_bar()
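If you want to check the numbers before plotting, the same split can be sketched in base R. This is only a sanity check, not part of the tidyr approach, and it assumes the answers are separated by commas (the names reasons and counts are just for illustration):

```r
data <- data.frame(reason = c(rep("R1", 5), rep("R2", 3), rep("R1,R2", 4)))

# Split every answer on commas; unlist() gives one element per selected reason.
# as.character() guards against the column being a factor (the default before R 4.0).
reasons <- trimws(unlist(strsplit(as.character(data$reason), ",", fixed = TRUE)))

# Tally how often each reason was selected
counts <- table(reasons)
print(counts)
```

For the example data this should give 9 for R1 and 7 for R2, matching the totals asked for in the question.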
CodePudding user response:
You need to do some data management first. Something like:
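library(dplyr)
library(tidyr)
library(ggplot2)
mutate(data,
       R1 = grepl('R1', reason),   # TRUE if the answer includes R1
       R2 = grepl('R2', reason)) %>%
  select(-reason) %>%
  pivot_longer(everything(), names_to = "reason") %>%
  filter(value) %>%                # keep only the reasons that were selected
  count(reason) %>%
  print() %>%                      # show the counts, then pass them on to ggplot
  ggplot(aes(x = reason, y = n)) +
  geom_col() +
  coord_flip()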
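The indicator-column idea can also be sketched without any packages, which may help to see what the pipeline computes. This is a rough base-R equivalent rather than the code above; flags and counts are illustrative names. Note that grepl does substring matching, so a label such as R1 would also match R10 in real survey data.

```r
data <- data.frame(reason = c(rep("R1", 5), rep("R2", 3), rep("R1,R2", 4)))

# One logical column per reason: TRUE where the answer contains that label.
# fixed = TRUE treats the label as a literal string, not a regular expression.
flags <- sapply(c(R1 = "R1", R2 = "R2"),
                function(r) grepl(r, data$reason, fixed = TRUE))

# Summing logicals counts the TRUE values in each column
counts <- colSums(flags)
print(counts)
```

Here counts is a named vector with 9 for R1 and 7 for R2; barplot(counts, horiz = TRUE) would draw it without ggplot2.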