In R, how can I get geom_boxplot to display a time series in chronological order?

Time:08-19

I cannot get geom_boxplot to display a time series from oldest date to newest date.

Example Data:

set.seed(321)
dat <- data.frame(matrix(ncol = 3, nrow = 216))
colnames(dat)[1:3] <- c("Date","Location","Value")
dat$Value <- round(rnorm(216, 100,75),0)
dat$Location <- rep(c("Location 1","Location 2"), each = 108)
st <- as.Date("2020-01-01")
en <- as.Date("2022-12-31")
dat$Date <- rep(seq.Date(st,en,by = '1 month'),6)
str(dat)

This plot has the x-axis sorted based on the alphabetical order of the month, then year (i.e., Apr 2020, Apr 2021, Apr 2022, Aug 2020, etc).

library(ggplot2)

ggplot(dat, aes(x = format(Date, "%b %Y"), y = Value, fill = Location, group = interaction(Date, Location))) +
  geom_boxplot() +
  labs(x = "") +
  theme_bw() +
  theme(panel.grid = element_blank(),
        text = element_text(size = 16),
        axis.text.x = element_text(size = 14, color = "black", angle = 90, vjust = 0.5, hjust = 1),
        axis.text.y = element_text(size = 14, color = "black"))

How can I get the x-axis to display in chronological order (i.e., Jan 2020, Feb 2020, etc.)?

CodePudding user response:

One way is to keep Date as a Date on the x-axis and format the labels with scale_x_date():

library(ggplot2)

ggplot(dat, aes(x = Date, y = Value, fill = Location, group = interaction(Date, Location))) +
  geom_boxplot() +
  labs(x = "") +
  # One tick per month, labelled like "Jan 2020"
  scale_x_date(date_breaks = "1 month",
               date_labels = "%b %Y") +
  theme_bw() +
  theme(panel.grid = element_blank(),
        text = element_text(size = 16),
        axis.text.x = element_text(size = 14, color = "black", angle = 90, vjust = 0.5, hjust = 1),
        axis.text.y = element_text(size = 14, color = "black"))


CodePudding user response:

For ggplot2 to arrange your data by Date, it's simplest to keep the x-axis data as Date data (which ggplot2 knows how to display on a timeline) instead of as formatted text, to which ggplot2 assigns no semantic meaning ("Jan 2020" is no different from "Apple"), so the labels end up in alphabetical order. You can adjust the label formatting with scale_x_date(date_labels = ...).

(It's also possible to sort text values by converting them to "factors," which have the option of being ordered, but then you would lose the relationship to the timeline; gaps in time would not be shown with gaps, for instance.)

library(ggplot2)
ggplot(dat, aes(x = Date, y = Value, fill = Location, group = interaction(Date, Location))) +
  geom_boxplot() +
  # Keep the Date scale; only the label format changes
  scale_x_date(date_labels = "%b %Y") +
  labs(x = "") +
  theme_bw() +
  theme(panel.grid = element_blank(),
        text = element_text(size = 16),
        axis.text.x = element_text(size = 14, color = "black", angle = 90, vjust = 0.5, hjust = 1),
        axis.text.y = element_text(size = 14, color = "black"))

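For completeness, the factor approach mentioned in the parenthetical above could look roughly like the sketch below. It rebuilds the question's example data in compact form so it runs on its own. The column name Month and the helper variable month_labels are made up for this illustration and are not from the original post. As noted, the axis becomes discrete, so gaps in time would not show up as gaps.

```r
library(ggplot2)

# Rebuild the question's example data in a compact form.
set.seed(321)
dat <- data.frame(
  Date = rep(seq.Date(as.Date("2020-01-01"), as.Date("2022-12-31"), by = "1 month"), 6),
  Location = rep(c("Location 1", "Location 2"), each = 108),
  Value = round(rnorm(216, 100, 75), 0)
)

# Format the dates as text, then turn the text into a factor whose levels
# follow date order rather than alphabetical order.
# `month_labels` and `Month` are hypothetical names used only in this sketch.
month_labels <- format(dat$Date, "%b %Y")
dat$Month <- factor(month_labels, levels = unique(month_labels[order(dat$Date)]))

# A discrete x-axis is drawn in factor-level order, so the months appear
# chronologically; with a discrete x and a fill, no explicit group is needed.
p <- ggplot(dat, aes(x = Month, y = Value, fill = Location)) +
  geom_boxplot() +
  labs(x = "") +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
p
```

This is only worth doing if you specifically want evenly spaced categories; for a real timeline, keeping Date on the x-axis as in the code above is the simpler route.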
