Plot multiple variables in the same bar plot

Time:11-26

My data frame looks like this (1322 rows in total):

My dataframe

I'd like to make a bar plot showing the percentage of ratings for each CFS score. It should look similar to this:

[Image: example of the desired bar plot]

With this code, I can make a single bar plot for the column cfs_triage:

ggplot(data = df) +
  geom_bar(mapping = aes(x = cfs_triage, y = (..count..)/sum(..count..)))

My very basic barplot

But I can't figure out how to make one with the three variables next to one another.

Thank you in advance to everyone who helps me make this bar plot with the percentage of ratings for these three variables! (I'm not sure my explanation is very clear, but I hope it is :))

CodePudding user response:

Your best bet here is to pivot your data into long format. We don't have your data, but we can reproduce a similar data set like this:

set.seed(1)

df <- data.frame(cfs_triage  = sample(10, 1322, TRUE, prob = 1:10), 
                 cfs_silver  = sample(10, 1322, TRUE), 
                 cfs_student = sample(10, 1322, TRUE, prob = 10:1))

df[] <- lapply(df, function(x) { x[sample(1322, 300)] <- NA; x})

Now the dummy data set looks a lot like yours:

head(df)
#>   cfs_triage cfs_silver cfs_student
#> 1          9         NA           1
#> 2          8          4           2
#> 3         NA          8          NA
#> 4         NA         10           9
#> 5          9          5          NA
#> 6          3          1          NA

If we pivot into long format, then we will end up with two columns: one containing the values, and one containing the column name that the value belonged to in the original data frame:

library(tidyverse)

df_long <- df %>%
  pivot_longer(everything())

head(df_long)
#> # A tibble: 6 x 2
#>   name        value
#>   <chr>       <int>
#> 1 cfs_triage      9
#> 2 cfs_silver     NA
#> 3 cfs_student     1
#> 4 cfs_triage      8
#> 5 cfs_silver      4
#> 6 cfs_student     2
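The default column names `name` and `value` can be replaced with more descriptive ones through the `names_to` and `values_to` arguments of `pivot_longer()`. The sketch below uses a tiny made-up data frame (`toy`), and the names `rater` and `score` are hypothetical choices for illustration only; the rest of this answer keeps the defaults.

```r
library(tidyr)

# Tiny made-up data set, for illustration only (not the asker's data)
toy <- data.frame(cfs_triage  = c(9, 8),
                  cfs_silver  = c(NA, 4),
                  cfs_student = c(1, 2))

# names_to / values_to set the names of the two long-format columns;
# "rater" and "score" are hypothetical names chosen for this sketch
toy_long <- pivot_longer(toy, everything(),
                         names_to = "rater", values_to = "score")

names(toy_long)
```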

This then allows us to plot with value on the x axis, and we can use name as a grouping / fill variable:

ggplot(df_long, aes(value, fill = name)) +
  geom_bar(position = 'dodge') +
  scale_fill_grey(name = NULL) +
  theme_bw(base_size = 16) +
  scale_x_continuous(breaks = 1:10)
#> Warning: Removed 900 rows containing non-finite values (`stat_count()`).
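The plot above shows counts, whereas the question asks for percentages. One possible way to get them, offered as a sketch rather than as part of the original answer, is to count the ratings per variable first and then divide by each variable's total. The data frame `toy` and the axis labels below are made up for illustration; the same pipeline should work on `df` if its columns are as described.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Small made-up data set, for illustration only (not the asker's data)
toy <- data.frame(cfs_triage  = c(1, 2, 2, NA),
                  cfs_silver  = c(1, 1, 2, 3),
                  cfs_student = c(3, NA, 3, 2))

toy_pct <- toy %>%
  pivot_longer(everything()) %>%
  filter(!is.na(value)) %>%          # drop missing ratings before counting
  count(name, value) %>%             # ratings per variable and score
  group_by(name) %>%
  mutate(percent = n / sum(n)) %>%   # share within each variable
  ungroup()

ggplot(toy_pct, aes(factor(value), percent, fill = name)) +
  geom_col(position = position_dodge(preserve = "single")) +
  scale_y_continuous(labels = scales::percent) +
  scale_fill_grey(name = NULL) +
  labs(x = "CFS score", y = "Percentage of ratings") +
  theme_bw(base_size = 16)
```

Because the percentages are computed within each variable, the bars of each variable sum to 100%, regardless of how many values are missing in that column.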

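Created on 2022-11-25

CodePudding user response:

An alternative that computes, for each row, each variable's share of the row total and draws one facet per row: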

library(tidyverse)
library(scales)

df %>% 
  mutate(id = row_number()) %>%       # row identifier, used for grouping and faceting
  pivot_longer(-id) %>% 
  group_by(id) %>% 
  # share of each variable within its row
  mutate(percent = value/sum(value, na.rm = TRUE)) %>% 
  mutate(percent = ifelse(is.na(percent), 0, percent)) %>% 
  mutate(my_label = str_trim(paste0(format(100 * percent, digits = 1), "%"))) %>% 
  ggplot(aes(x = factor(name), y = percent, fill = factor(name), label = my_label)) +
  geom_col(position = position_dodge()) +
  geom_text(aes(label = my_label), vjust=-1) +
  # one facet per row, so this is only practical for a handful of rows
  facet_wrap(. ~ id, nrow=1,  strip.position = "bottom") +
  scale_fill_grey(name = NULL) +
  scale_y_continuous(labels = scales::percent) +
  theme_bw(base_size = 16) +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))