I have code that produces a good bar plot, and I'm trying to create a box plot from the same data. The bar plot displays the count of each "response" across all participants (id). I'd like to replace the 3 bars with one box plot per "response" type, where each box plot is calculated from the per-participant counts of that specific response. So far no luck, because I'm stuck on how to count the responses for each participant.
Current code:
df %>%
  ggplot() +
  labs(title = "question") +
  geom_bar(aes(x = response), fill = "red") +
  labs(y = "count", x = "responses") +
  scale_y_continuous(breaks = seq(0, 100, 20), limits = c(0, 100))
Data sample:
df <- structure(list(id = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5),
response = c(0, 1, 1, 0, 0, 0, 1, -1, 1, -1, 0, 1, -1, 1,
0, 0, 0, 0, 1, 1, 1, -1, 0, 1, 0, 1, 1, -1, 0, 1, 1, 1, 0,
1, 0, 0, 1, -1, 0, 1, 1, 1, -1, 1, 1, 1, 0, 0, -1, 1, 1,
-1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0,
1, 1, 0, 0, 0), iscorrect = c(0, 1, 1, 0, 0, 0, 1, 0, 1,
0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0,
0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0,
0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1,
0, 1, 0, 0, 1, 1, 0, 0, 0), min = c(100, 150, 150,
50, 50, 50, 150, 100, 100, 100, 50, 100, 50, 150, 150, 150,
50, 100, 100, 100, 150, 150, 50, 50, 50, 150, 150, 100, 50,
100, 100, 150, 150, 50, 50, 50, 150, 100, 100, 100, 50, 100,
50, 150, 150, 150, 50, 100, 100, 100, 150, 150, 50, 50, 50,
150, 150, 100, 50, 100, 100, 150, 100, 50, 50, 50, 150, 100,
100, 50, 150, 100, 50, 150, 150), max = c(125.4, 180.8,
180.8, 62.4, 62.4, 62.4, 180.8, 125.4, 125.4, 125.4, 62.4,
125.4, 62.4, 180.8, 180.8, 180.8, 62.4, 125.4, 125.4, 125.4,
180.8, 180.8, 62.4, 62.4, 62.4, 180.8, 180.8, 125.4, 62.4,
125.4, 125.4, 180.8, 180.8, 62.4, 62.4, 62.4, 180.8, 125.4,
125.4, 125.4, 62.4, 125.4, 62.4, 180.8, 180.8, 180.8, 62.4,
125.4, 125.4, 125.4, 180.8, 180.8, 62.4, 62.4, 62.4, 180.8,
180.8, 125.4, 62.4, 125.4, 125.4, 180.8, 125.4, 62.4, 62.4,
62.4, 180.8, 125.4, 125.4, 62.4, 180.8, 125.4, 62.4, 180.8,
180.8), time = c(5, 7, 9, 5, 1, 7, 1, 1, 7, 3, 9, 9,
3, 5, 3, 1, 9, 5, 1, 7, 9, 3, 5, 7, 1, 5, 7, 3, 3, 9, 5,
7, 9, 5, 1, 7, 1, 1, 7, 3, 9, 9, 3, 5, 3, 1, 9, 5, 1, 7,
9, 3, 5, 7, 1, 5, 7, 3, 3, 9, 9, 7, 5, 7, 5, 9, 5, 3, 1,
1, 9, 7, 3, 3, 1)), row.names = c(NA, -75L), class = c("tbl_df",
"tbl", "data.frame"))
User response:
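You can try:
library(dplyr)
library(ggplot2)

df %>%
  group_by(id, response) %>%
  count() %>%    # one row per participant and response, count in column n
  ungroup() %>%  # drop the grouping before converting the grouping columns
  mutate(id = factor(id), response = factor(response)) %>%
  ggplot(aes(response, n)) +
  geom_boxplot(fill = "red") +
  scale_y_continuous(name = "Number of responses per participant")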
Note that box plots don't work well for discrete data such as small counts, unless your actual data has far more participants and far higher counts per response.
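One possible way to soften that limitation is to draw the individual per-participant counts on top of the boxes, so the small number of discrete values stays visible. The following is only a sketch under assumptions: `toy` is made-up illustrative data (not the data from the question), and the jitter width is an arbitrary choice.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data: 4 participants with 6 responses each
toy <- tibble(
  id = rep(1:4, each = 6),
  response = c(-1, 0, 0, 1, 1, 1,
               -1, -1, 0, 1, 1, 1,
                0, 0, 0, 1, 1, -1,
               -1, 0, 1, 1, 1, 1)
)

# Count each response per participant
counts <- toy %>%
  count(id, response) %>%
  mutate(response = factor(response))

# Box plot with the raw counts overlaid as points;
# outlier.shape = NA avoids drawing outliers twice
p <- ggplot(counts, aes(response, n)) +
  geom_boxplot(fill = "red", outlier.shape = NA) +
  geom_jitter(width = 0.1, height = 0) +
  labs(x = "responses", y = "Number of responses per participant")

p
```

Setting `height = 0` keeps the points at their true count values and only spreads them horizontally.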
User response:
You can use this code:
df %>%
  group_by(id) %>%
  count(response) %>%  # count each response within each participant
  mutate(response = as.factor(response)) %>%
  ggplot(aes(x = response, y = n)) +
  geom_boxplot(fill = "red") +
  labs(y = "count", x = "responses")
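One thing to keep in mind with `count()` is that it only returns combinations that actually occur, so a participant who never gave a particular response is left out of that box rather than contributing a 0. If zeros should be included, `tidyr::complete()` can fill them in. This is a minimal sketch on made-up data (`toy` is hypothetical, not the data from the question):

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: participant 2 never gives response -1
toy <- tibble(
  id = c(1, 1, 1, 2, 2, 2),
  response = c(-1, 0, 1, 0, 1, 1)
)

counts <- toy %>%
  count(id, response) %>%
  # add the missing id/response combinations with a count of 0
  complete(id, response, fill = list(n = 0L))

counts
```

The completed `counts` table can then be passed to the same `ggplot()` call as above. Whether a missing combination should count as 0 depends on the study design, so treat this as optional.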