Create a histogram showing month and year

Time:08-09

I can create a histogram with only the month or only the year on the x-axis, but not with month and year together.

This is the data I have

Year Month stroke_victims
2005 1 40
2005 2 50
2005 3 48
2005 4 52
2005 5 32
2005 6 21
2005 7 23
2005 8 56
2005 9 43
2005 10 11
2005 11 12
2005 12 15
2006 1 40
2006 2 50
2006 3 48
2006 4 52
2006 5 32
2006 6 21
2006 7 23
2006 8 56
2006 9 43
2006 10 11
2006 11 12
2006 12 15
2007 1 40
2007 2 50
2007 3 48
2007 4 52
2007 5 32
2007 6 21
2007 7 23
2007 8 56
2007 9 43
2007 10 11
2007 11 12
2007 12 15

For year the code is like this

ggplot(stroke, aes(x=Year, y=stroke_victims)) + geom_bar(stat="identity") + ggtitle("Stroke victim patterns across the year 2005-2007")

For month the code is like this

ggplot(stroke, aes(x=Month, y=stroke_victims)) + geom_bar(stat="identity") + ggtitle("Stroke victim patterns by Months")

Is it possible to write code that shows January 2005, then February 2005, and so on, using the month name?

CodePudding user response:

There are two ways to get the month name:

stroke$Month_text <- month.abb[stroke$Month]
head(stroke)
#   Year Month stroke_victims Month_text
# 1 2005     1             40        Jan
# 2 2005     2             50        Feb
# 3 2005     3             48        Mar
# 4 2005     4             52        Apr
# 5 2005     5             32        May
# 6 2005     6             21        Jun

or one can convert it to a date and use format(.., format="%B") (see ?strptime for %-codes). I'll use month.abb for now.
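For completeness, here is a minimal sketch of the date-based alternative mentioned above. It rebuilds the question's data and assumes each row can be treated as the first day of its month; the column names Date and Month_full are illustrative, not from the original answer. Note that %B and %b follow the current locale, so the names may not be English on every machine.

```r
# Rebuild the example data from the question (three identical years).
stroke <- data.frame(
  Year = rep(2005:2007, each = 12),
  Month = rep(1:12, times = 3),
  stroke_victims = rep(c(40, 50, 48, 52, 32, 21, 23, 56, 43, 11, 12, 15), times = 3)
)

# Assumption: represent each Year/Month pair as the first day of that month.
stroke$Date <- as.Date(sprintf("%d-%02d-01", stroke$Year, stroke$Month))

# %B is the full month name, %b the abbreviated one (both locale-dependent).
stroke$Month_full <- format(stroke$Date, format = "%B")
head(stroke)
```

The month.abb lookup is simpler when only the abbreviation is needed; the Date column becomes useful once year and month have to share one axis.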

From here, we can put Month_text on the x-axis and facet by Year. However, the months are sorted alphabetically (Apr, Aug, Dec, ...), which is the wrong order:

ggplot(stroke, aes(x=Month_text, y=stroke_victims)) +
  geom_bar(stat="identity") +
  ggtitle("Stroke victim patterns by Months") +
  facet_grid(Year ~ .)

ggplot with incorrect month sort

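As with almost any ordering question in ggplot, the fix is to make the variable a factor with explicit levels, here factor(stroke$Month_text, levels = month.abb). Re-running the same plot then gives correctly sorted months.

ggplot with correctly sorted months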
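The ordering problem is usually solved by storing Month_text as a factor whose levels are in calendar order, which is also how the Data section below stores it. A minimal sketch, assuming ggplot2 is installed and rebuilding the question's data so the block runs on its own:

```r
library(ggplot2)

# Rebuild the example data from the question (three identical years).
stroke <- data.frame(
  Year = rep(2005:2007, each = 12),
  Month = rep(1:12, times = 3),
  stroke_victims = rep(c(40, 50, 48, 52, 32, 21, 23, 56, 43, 11, 12, 15), times = 3)
)
stroke$Month_text <- month.abb[stroke$Month]

# Explicit levels make ggplot order the axis Jan..Dec instead of alphabetically.
stroke$Month_text <- factor(stroke$Month_text, levels = month.abb)

p <- ggplot(stroke, aes(x = Month_text, y = stroke_victims)) +
  geom_bar(stat = "identity") +
  ggtitle("Stroke victim patterns by Months") +
  facet_grid(Year ~ .)
p
```

Each facet row is one year, so the month names repeat per panel rather than running along a single axis.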


Data

stroke <- structure(list(Year = c(2005L, 2005L, 2005L, 2005L, 2005L, 2005L, 2005L, 2005L, 2005L, 2005L, 2005L, 2005L, 2006L, 2006L, 2006L, 2006L, 2006L, 2006L, 2006L, 2006L, 2006L, 2006L, 2006L, 2006L, 2007L, 2007L, 2007L, 2007L, 2007L, 2007L, 2007L, 2007L, 2007L, 2007L, 2007L, 2007L), Month = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L), stroke_victims = c(40L, 50L, 48L, 52L, 32L, 21L, 23L, 56L, 43L,  11L, 12L, 15L, 40L, 50L, 48L, 52L, 32L, 21L, 23L, 56L, 43L, 11L, 12L, 15L, 40L, 50L, 48L, 52L, 32L, 21L, 23L, 56L, 43L, 11L, 12L, 15L), Month_text = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L), .Label = c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"), class = "factor")), row.names = c(NA, -36L), class = "data.frame")

CodePudding user response:

If you want all months on a single continuous x-axis, see below. Otherwise, use facet_grid as in r2evans's answer.

library(dplyr)
library(lubridate)
library(forcats)
library(ggplot2)

stroke %>% 
  # as_factor() keeps levels in order of first appearance, so rows must be in date order
  mutate(year_month_lab = forcats::as_factor(paste(Year, month.abb[Month], sep = " "))) %>% 
  ggplot(aes(x=year_month_lab, y=stroke_victims)) +
  geom_bar(stat="identity") +
  ggtitle("Stroke victim patterns by Year Month") +
  theme(axis.text.x=element_text(angle=90, hjust=1, vjust=0.5))
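The factor-label approach depends on the rows already being in chronological order. As a possible alternative (not part of the original answer), the x-axis can be a real Date so ggplot handles ordering and tick labels itself. This sketch assumes ggplot2 is installed and that each row can be placed on the first day of its month; the Date column is an illustrative name.

```r
library(ggplot2)

# Rebuild the example data from the question (three identical years).
stroke <- data.frame(
  Year = rep(2005:2007, each = 12),
  Month = rep(1:12, times = 3),
  stroke_victims = rep(c(40, 50, 48, 52, 32, 21, 23, 56, 43, 11, 12, 15), times = 3)
)

# Assumption: place each observation on the first day of its month.
stroke$Date <- as.Date(sprintf("%d-%02d-01", stroke$Year, stroke$Month))

p <- ggplot(stroke, aes(x = Date, y = stroke_victims)) +
  geom_col() +  # same as geom_bar(stat = "identity")
  # One tick every three months, labelled like "Jan 2005" (locale-dependent).
  scale_x_date(date_breaks = "3 months", date_labels = "%b %Y") +
  ggtitle("Stroke victim patterns by Year Month") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))
p
```

With a Date axis the bars stay in order even if the data frame is shuffled, and the tick density can be tuned through date_breaks instead of labelling all 36 bars.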
