ggplot barplot to boxplot

Time:03-08

I have code that produces a good bar plot, and I'm trying to create a boxplot from the same data. The bar plot displays the count of each "response" value across all participants (id). I'd like to replace the 3 bars with one boxplot per "response" value, where each boxplot is computed from the per-participant counts of that response. So far no luck: I'm stuck on how to count the responses for each participant.

current code:

df %>%
  ggplot() +
  labs(title = "question") +
  geom_bar(aes(x = response), fill = "red") +
  labs(y = "count", x = "responses") +
  scale_y_continuous(breaks = seq(0, 100, 20), limits = c(0, 100))

data sample:

structure(list(id = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 
4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5), 
    response = c(0, 1, 1, 0, 0, 0, 1, -1, 1, -1, 0, 1, -1, 1, 
    0, 0, 0, 0, 1, 1, 1, -1, 0, 1, 0, 1, 1, -1, 0, 1, 1, 1, 0, 
    1, 0, 0, 1, -1, 0, 1, 1, 1, -1, 1, 1, 1, 0, 0, -1, 1, 1, 
    -1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 
    1, 1, 0, 0, 0), iscorrect = c(0, 1, 1, 0, 0, 0, 1, 0, 1, 
    0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 
    0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 
    0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 
    0, 1, 0, 0, 1, 1, 0, 0, 0), min = c(100, 150, 150, 
    50, 50, 50, 150, 100, 100, 100, 50, 100, 50, 150, 150, 150, 
    50, 100, 100, 100, 150, 150, 50, 50, 50, 150, 150, 100, 50, 
    100, 100, 150, 150, 50, 50, 50, 150, 100, 100, 100, 50, 100, 
    50, 150, 150, 150, 50, 100, 100, 100, 150, 150, 50, 50, 50, 
    150, 150, 100, 50, 100, 100, 150, 100, 50, 50, 50, 150, 100, 
    100, 50, 150, 100, 50, 150, 150), max = c(125.4, 180.8, 
    180.8, 62.4, 62.4, 62.4, 180.8, 125.4, 125.4, 125.4, 62.4, 
    125.4, 62.4, 180.8, 180.8, 180.8, 62.4, 125.4, 125.4, 125.4, 
    180.8, 180.8, 62.4, 62.4, 62.4, 180.8, 180.8, 125.4, 62.4, 
    125.4, 125.4, 180.8, 180.8, 62.4, 62.4, 62.4, 180.8, 125.4, 
    125.4, 125.4, 62.4, 125.4, 62.4, 180.8, 180.8, 180.8, 62.4, 
    125.4, 125.4, 125.4, 180.8, 180.8, 62.4, 62.4, 62.4, 180.8, 
    180.8, 125.4, 62.4, 125.4, 125.4, 180.8, 125.4, 62.4, 62.4, 
    62.4, 180.8, 125.4, 125.4, 62.4, 180.8, 125.4, 62.4, 180.8, 
    180.8), time = c(5, 7, 9, 5, 1, 7, 1, 1, 7, 3, 9, 9, 
    3, 5, 3, 1, 9, 5, 1, 7, 9, 3, 5, 7, 1, 5, 7, 3, 3, 9, 5, 
    7, 9, 5, 1, 7, 1, 1, 7, 3, 9, 9, 3, 5, 3, 1, 9, 5, 1, 7, 
    9, 3, 5, 7, 1, 5, 7, 3, 3, 9, 9, 7, 5, 7, 5, 9, 5, 3, 1, 
    1, 9, 7, 3, 3, 1)), row.names = c(NA, -75L), class = c("tbl_df", 
"tbl", "data.frame"))

Answer:

You can try:

library(dplyr)
library(ggplot2)

df %>%
  group_by(id, response) %>%
  count() %>%
  mutate(id = factor(id), response = factor(response)) %>%
  ggplot(aes(response, n)) +
  geom_boxplot(fill = "red") +
  scale_y_continuous(name = "Number of responses per participant")

Note that boxplots don't work well for discrete data such as small counts, unless your actual data has far more participants and a far higher count per response.
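
One possible way to soften this limitation is to draw the individual per-participant counts on top of the boxes, so the reader can see how few distinct values each box summarises. The following is only a sketch: the small `df` below is a made-up stand-in for the question's data, and the jitter width and transparency are arbitrary choices.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data standing in for the question's df:
# 5 participants, 6 responses each, coded -1 / 0 / 1.
df <- data.frame(
  id = rep(1:5, each = 6),
  response = c(0, 1, 1, -1, 0, 1,
               1, 1, 0, 0, -1, 1,
               1, 0, 1, 1, -1, -1,
               0, 1, 1, 1, -1, 0,
               1, 0, 0, 1, 1, 0)
)

# One row per participant/response combination that occurs, with its count n.
counts <- df %>%
  count(id, response) %>%
  mutate(response = factor(response))

p <- ggplot(counts, aes(x = response, y = n)) +
  # Hide the boxplot's own outlier markers so points are not drawn twice.
  geom_boxplot(fill = "red", outlier.shape = NA) +
  # Jitter horizontally only, so the plotted counts stay exact.
  geom_jitter(width = 0.15, height = 0, alpha = 0.6) +
  labs(y = "Number of responses per participant", x = "responses")

print(p)
```

With only a handful of participants, the points often tell the story better than the box itself.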

Answer:

You can use this code:

df %>%
  group_by(id) %>%
  count(response) %>%
  mutate(response = as.factor(response)) %>%
  ggplot(aes(x = response, y = n)) +
  geom_boxplot(fill = "red") +
  labs(y = "count", x = "responses")
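
One caveat with counting this way: `count()` only returns the id/response combinations that actually occur, so a participant who never gave a particular response contributes no row instead of a count of 0 (in the sample data, participant 5 appears to have no -1 responses). If those zeros should be part of the boxplot, a sketch along the following lines may help. It assumes `tidyr` is available, and the small `df` is a made-up stand-in for the question's data.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical toy data: participant 5 never gives a -1 response.
df <- data.frame(
  id = rep(1:5, each = 6),
  response = c(0, 1, 1, -1, 0, 1,
               1, 1, 0, 0, -1, 1,
               1, 0, 1, 1, -1, -1,
               0, 1, 1, 1, -1, 0,
               1, 0, 0, 1, 1, 0)
)

counts <- df %>%
  count(id, response) %>%
  # Add the missing id/response combinations with an explicit count of 0.
  complete(id, response, fill = list(n = 0L)) %>%
  mutate(response = factor(response))

p <- ggplot(counts, aes(x = response, y = n)) +
  geom_boxplot(fill = "red") +
  labs(y = "count", x = "responses")

print(p)
```

Whether a missing combination should count as 0 depends on the study design, so treat this as an option rather than a required step.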
