Order the plotting of data by the contents of a date column

Time:02-12

I have a dataset as follows:

df = data.frame(name = c('Ger1', 'Ger2', 'Ger3', 'Ger4', 'Ger5', 'Ger6'),
            issued = c('UKS', 'USD', 'UKS', 'UKS', 'USD', 'USD'),
            mat = c('2024-01-31', '2023-01-31', '2026-10-22', '2022-07-22', '2029-01-31', '2025-06-07'),
            volume = c(0.476, 0.922, 0.580, 1.259, 0.932, 0.417))

I currently plot (and filter) the data using the following code:

plot1 <- ggplot(subset(df, issued == "UKS")) +
  geom_bar(stat="identity", aes(x=volume,y=name),fill="#1170aa") +
  theme(title=element_text(size=12),
        panel.background = element_rect(fill='white',color='black'),
        legend.position='right') +
  labs(title = "Total carriage by Volume on the day", x = "Volume", y = "Name")

I'd like to order the bars using the 'mat' column as a guide, with the row that has the earliest 'mat' date at the top of the Y axis and the row with the most distant 'mat' date at the bottom. Does anyone have any advice on how to achieve this?

Edit: I use grid arrange to plot it against another chart.

grid.arrange(plot1,plot2,ncol=2)

Sadly I get the following error:

Error in `-.POSIXt`(Maturity) : unary '-' is not defined for "POSIXt" objects

CodePudding user response:

You can use stats::reorder() inside aes() to reorder the bars. reorder() sorts the levels of its first argument by the values of a second variable, summarised per level with FUN (the default is mean), so for a continuous variable like a date you can control how the sorting is done. In your data (although you didn't post it this way in the question), it seems your mat variable is POSIXlt. That class cannot be operated on directly as a numeric, which is why the unary minus fails. I suggest converting it to POSIXct instead, and then it will work. See ?stats::reorder for more info on how to control this. Another option is to set the levels of the factor in the data before passing it to ggplot(), which might be the better choice if you have complex sorting to do.

library(tidyverse)

df <- data.frame(name = c('Ger1', 'Ger2', 'Ger3', 'Ger4', 'Ger5', 'Ger6'),
                issued = c("UKS", "USD", "UKS", "UKS", "USD", "USD"),
                mat = c("2024-01-31", "2023-01-31", "2026-10-22", "2022-07-22", "2029-01-31", "2025-06-07"),
                volume = c(0.476, 0.922, 0.580, 1.259, 0.932, 0.417))

df %>%
  mutate(mat = as.POSIXct(mat)) %>%
  filter(issued == "UKS") %>%
  # alternative using forcats:
  # mutate(name = fct_reorder(.f = name, .x = mat)) %>% ggplot(aes(volume, name)) +
  ggplot(aes(x = volume, y = reorder(x = name, X = mat, FUN = sort))) +
  geom_col(fill = "#1170aa") +
  labs(title = "Total carriage by Volume on the day", x = "Volume", y = "Name") +
  theme(
    title = element_text(size = 12),
    panel.background = element_rect(fill = 'white', color = 'black'),
    legend.position = 'right'
  )

Created on 2022-02-07 by the reprex package (v2.0.1)
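
The other option mentioned above, setting the factor levels in the data before calling ggplot(), could look roughly like the sketch below. It is only an illustration under a few assumptions: mat is converted with as.Date() rather than POSIXct, and the intermediate data frame uks is a name made up for this example. Because ggplot2 draws the first level of a discrete y axis at the bottom, the levels are ordered from latest to earliest so that the earliest mat date ends up at the top, as asked in the question.

```r
library(ggplot2)

df <- data.frame(
  name   = c("Ger1", "Ger2", "Ger3", "Ger4", "Ger5", "Ger6"),
  issued = c("UKS", "USD", "UKS", "UKS", "USD", "USD"),
  # Assumption: dates are stored as Date; POSIXct would order the same way
  mat    = as.Date(c("2024-01-31", "2023-01-31", "2026-10-22",
                     "2022-07-22", "2029-01-31", "2025-06-07")),
  volume = c(0.476, 0.922, 0.580, 1.259, 0.932, 0.417)
)

# 'uks' is a hypothetical helper name for the filtered data
uks <- subset(df, issued == "UKS")

# The first level is drawn at the bottom of the y axis, so put the latest
# date first and the earliest date last (the last level is drawn at the top).
uks$name <- factor(uks$name,
                   levels = uks$name[order(uks$mat, decreasing = TRUE)])

plot1 <- ggplot(uks, aes(x = volume, y = name)) +
  geom_col(fill = "#1170aa") +
  labs(title = "Total carriage by Volume on the day",
       x = "Volume", y = "Name")
```

Since plot1 is an ordinary ggplot object, it can still be passed to grid.arrange(plot1, plot2, ncol = 2) as in the question. If you prefer to stay with reorder() inside aes(), converting the date first, for example reorder(name, -as.numeric(mat)), avoids the unary minus error because the negation is applied to a plain number.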

CodePudding user response:

So, I was able to fix the ordering issue by appending the mat column data to the name, and then adding:

 scale_y_discrete(limits=rev)

to the end of the ggplot code.
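
A rough sketch of how that workaround might look is below. It assumes the date is placed in front of the name (the answer does not say exactly how the two were combined) and that mat is in YYYY-MM-DD form, so the labels sort chronologically as plain text. The column label and the data frame uks are names made up for this example.

```r
library(ggplot2)

df <- data.frame(
  name   = c("Ger1", "Ger2", "Ger3", "Ger4", "Ger5", "Ger6"),
  issued = c("UKS", "USD", "UKS", "UKS", "USD", "USD"),
  mat    = c("2024-01-31", "2023-01-31", "2026-10-22",
             "2022-07-22", "2029-01-31", "2025-06-07"),
  volume = c(0.476, 0.922, 0.580, 1.259, 0.932, 0.417)
)

# 'uks' and 'label' are hypothetical names used only in this sketch
uks <- subset(df, issued == "UKS")

# Assumption: date first, then name, so alphabetical order equals date order
uks$label <- paste(uks$mat, uks$name)

plot1 <- ggplot(uks, aes(x = volume, y = label)) +
  geom_col(fill = "#1170aa") +
  # A character y axis is sorted alphabetically from the bottom up;
  # reversing the limits puts the earliest date at the top.
  scale_y_discrete(limits = rev) +
  labs(title = "Total carriage by Volume on the day",
       x = "Volume", y = "Name")
```

The trade-off is that the axis labels now show the date as well as the name, which may or may not be wanted. Passing a function such as rev to limits needs a reasonably recent ggplot2 release.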
