Visualizing stacked bar chart in the format of Jonathan A. Schwabish (JEP 2014)

Time:11-21

I am trying to plot the following data (df_input) as a stacked bar chart in which lines connecting the bars also show the change over time. Any idea how to do it?

df_input <- data.frame( Year= c(2010,2010,2010,2010,2020,2020,2020,2020), village= c("A","B","C","D","A","B","C","D"), share = c(40,30,20,10,30,30,25,15)) 

df_input_2 <- data.frame( Year= c(2010,2010,2010,2010,2015,2015,2015,2015,2020,2020,2020,2020), village= c("A","B","C","D","A","B","C","D","A","B","C","D"), share = c(40,30,20,10,30,30,25,15,20,10,30,40))    

Example

CodePudding user response:

One option is to combine geom_col with geom_line. For the geom_line you have to group by the variable mapped to fill, set position to "stack", and shift the start and end positions to account for the width of the bars. Additionally, you have to set the orientation of the geom_line to "y" manually:

library(ggplot2)


width <- .6 # Bar width

ggplot(df_input, aes(share, factor(Year), fill = village)) +
  geom_col(width = width) +
  # Start each line at the inner edge of the 2010 bar and end it at the inner edge of the 2020 bar
  geom_line(aes(x = share, 
                y = as.numeric(factor(Year)) + ifelse(Year == 2020, -width / 2, width / 2), 
                group = village), position = "stack", orientation = "y")
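
To see what the y expression in geom_line evaluates to, here is a minimal sketch in base R. It only reproduces the arithmetic from the code above; the names bar_pos and line_y are made up for illustration and are not part of ggplot2:

df_input <- data.frame(
  Year = c(2010, 2010, 2010, 2010, 2020, 2020, 2020, 2020),
  village = c("A", "B", "C", "D", "A", "B", "C", "D"),
  share = c(40, 30, 20, 10, 30, 30, 25, 15)
)
width <- .6 # Bar width

# factor() turns the years into levels, as.numeric() gives their positions:
# 1 for 2010 and 2 for 2020
bar_pos <- as.numeric(factor(df_input$Year))

# Shift each line end by half a bar width towards the other bar
# (bar_pos and line_y are illustrative names)
line_y <- bar_pos + ifelse(df_input$Year == 2020, -width / 2, width / 2)

unique(data.frame(Year = df_input$Year, bar_pos, line_y))

With a bar width of .6 the lines run from 1.3 (upper edge of the 2010 bar) to 1.7 (lower edge of the 2020 bar), so they span only the gap between the two bars.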

EDIT: With more than two years things get a bit trickier. In that case I would switch to geom_segment. Additionally, some data wrangling is needed to prepare the data for use with geom_segment:

library(ggplot2)
library(dplyr)

# Example data with three years
df_input_2 <- data.frame( Year= c(2010,2010,2010,2010,2015,2015,2015,2015,2020,2020,2020,2020), village= c("A","B","C","D","A","B","C","D","A","B","C","D"), share = c(40,30,20,10,30,30,25,15,20,10,30,40))    

width <- .6 # Bar width

# Data wrangling
df_input_2 <- df_input_2 %>% 
  group_by(Year) %>% 
  arrange(desc(village)) %>% 
  mutate(share_cum = cumsum(share)) %>% 
  group_by(village) %>% 
  arrange(Year) %>% 
  mutate(Year = factor(Year),
         Year_lead = lead(Year), share_cum_lead = lead(share_cum))

ggplot(df_input_2, aes(share, factor(Year), fill = village)) +
  geom_col(width = width) +
  # Connect the top edge of each bar to the bottom edge of the next year's bar
  geom_segment(aes(x = share_cum, xend = share_cum_lead,
                   y = as.numeric(Year) + width / 2,
                   yend = as.numeric(Year_lead) - width / 2,
                   group = village))
#> Warning: Removed 4 rows containing missing values (geom_segment).
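
The warning comes from the rows of the last year: lead() has no following year to look at, so Year_lead and share_cum_lead are NA for each of the four villages. If the warning is unwanted, one possible approach (a sketch, not the only way) is to pass only the complete rows to geom_segment. The names df_plot and df_segments are made up for this example:

library(ggplot2)
library(dplyr)

df_input_2 <- data.frame(
  Year = c(2010, 2010, 2010, 2010, 2015, 2015, 2015, 2015, 2020, 2020, 2020, 2020),
  village = c("A", "B", "C", "D", "A", "B", "C", "D", "A", "B", "C", "D"),
  share = c(40, 30, 20, 10, 30, 30, 25, 15, 20, 10, 30, 40)
)
width <- .6 # Bar width

# Same data wrangling as above, stored under a new (illustrative) name
df_plot <- df_input_2 %>%
  group_by(Year) %>%
  arrange(desc(village)) %>%
  mutate(share_cum = cumsum(share)) %>%
  group_by(village) %>%
  arrange(Year) %>%
  mutate(Year = factor(Year),
         Year_lead = lead(Year), share_cum_lead = lead(share_cum)) %>%
  ungroup()

# Keep only rows that have a following year, i.e. drop the last year
df_segments <- filter(df_plot, !is.na(Year_lead))

ggplot(df_plot, aes(share, Year, fill = village)) +
  geom_col(width = width) +
  geom_segment(data = df_segments,
               aes(x = share_cum, xend = share_cum_lead,
                   y = as.numeric(Year) + width / 2,
                   yend = as.numeric(Year_lead) - width / 2,
                   group = village))

The bars are still drawn from all twelve rows; only the segment layer uses the eight rows that have a next year.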
