Ordering axes and making data more presentable

Time:11-26

I am trying to order the time and date axes on my scatter plot into epochs/time periods, for example times between 12pm-7:59pm and 9pm-11:59pm. I want to do something similar for the dates.

I am fairly new to R, so I am just looking for suggestions, or to be told whether this is even possible, and maybe some alternatives :)

This is my code so far:

library(tidyverse)

accident <- read.csv("accidents.csv", header = TRUE)
accident %>%
  ggplot() +
  geom_point(mapping = aes(x = Time, y = Date), alpha = 0.5)


Thank you!

CodePudding user response:

Welcome to R! Here is one set of options.

library(tidyverse)
library(lubridate)

First, simulate a dataset:

accident <- 
  rnorm(n = 1000, mean = 1500000000, sd = 1000000) %>% 
  tibble(date_time = .) %>% 
  mutate(date_time = as.POSIXct(date_time, origin = "1970-01-01")) %>% 
  separate(date_time, into = c("date", "time"), sep = " ", remove = F)
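If your real file has separate Date and Time text columns, as in the question, you would need to build a date_time column yourself before following the steps in this answer. A minimal base R sketch, assuming the columns look like "24/11/2021" and "17:45" (the example rows and the format strings are guesses, so adjust them to your file):

```r
# Hypothetical stand-in for the real accidents.csv; the column formats are assumed
accident_raw <- data.frame(
  Date = c("24/11/2021", "25/11/2021", "25/11/2021"),
  Time = c("17:45", "09:05", "23:30"),
  stringsAsFactors = FALSE
)

# Combine the two text columns and parse them into a single date-time
accident_raw$date_time <- as.POSIXct(
  paste(accident_raw$Date, accident_raw$Time),
  format = "%d/%m/%Y %H:%M",
  tz = "UTC"
)

# Hour of day as a number (0-23), ready for binning
accident_raw$hour <- as.integer(format(accident_raw$date_time, "%H"))
```

Any row whose text does not match the format string comes back as NA, so it is worth checking anyNA(accident_raw$date_time) before plotting.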

Original plot:

accident %>%
  ggplot() +
  geom_point(aes(x=time, y=date), alpha=0.5)

Step 1: Collapse the x axis into a smaller number of groups

accidents_per_trihour <- 
  accident %>%
  mutate(hour = lubridate::hour(date_time),  # hour of day, 0-23
         # right = FALSE makes every bin three hours wide: [0,3), [3,6), ..., [21,24]
         tri_hour = cut(hour, c(0, 3, 6, 9, 12, 15, 18, 21, 24), right = FALSE, include.lowest = TRUE)) %>% 
  group_by(date, tri_hour) %>% 
  count() 

Then scale dot size by number of accidents

accidents_per_trihour %>% 
  ggplot() +
  geom_point(aes(x=tri_hour, y=date, size = n), alpha=0.5) +
  labs(x = "\nTime (in three-hour groups)", y = "Day\n", size = "Accidents count")
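The question asks for custom periods rather than equal three-hour bins. The same cut() call can do that if you pass uneven breaks and your own labels. A small sketch in base R; the period boundaries and names here are made up for illustration, not taken from the question:

```r
# Hour of day (0-23) for a few example accidents
hour <- c(0, 5, 6, 11, 12, 19, 20, 23)

# Hypothetical period boundaries; change breaks and labels to suit your data
period <- cut(
  hour,
  breaks = c(0, 6, 12, 20, 24),
  labels = c("Night (12am-5:59am)", "Morning (6am-11:59am)",
             "Afternoon (12pm-7:59pm)", "Evening (8pm-11:59pm)"),
  right = FALSE,          # intervals are closed on the left: [0,6), [6,12), ...
  include.lowest = TRUE   # also closes the last interval on the right: [20,24]
)

# Count accidents per period
period_counts <- table(period)
```

Because cut() returns a factor whose levels follow the order of the breaks, using period on the x axis keeps the periods in time order instead of alphabetical order.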

Still not great because the y axis is too expansive. So:

Step 2: Collapse the y axis into a smaller number of groups

(For your data you may need to group into months for things to start to look reasonable)

accidents_per_trihour_per_week <- 
  accident %>%
  mutate(hour = lubridate::hour(date_time),  # hour of day, 0-23
         # right = FALSE makes every bin three hours wide: [0,3), [3,6), ..., [21,24]
         tri_hour = cut(hour, c(0, 3, 6, 9, 12, 15, 18, 21, 24), right = FALSE, include.lowest = TRUE)) %>% 
  mutate(week_start = floor_date(as.Date(date), unit = "weeks"), 
         week = format(week_start, "%Y, week %W"))  %>% 
  group_by(week, tri_hour) %>% 
  count() 
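If weeks are still too fine for your data, grouping by month works the same way. A minimal base R sketch on a few made-up dates (the date values are only for illustration):

```r
# A few example dates as text, in the same "YYYY-MM-DD" form as the date column
date <- c("2017-07-03", "2017-07-28", "2017-08-01", "2017-08-15", "2017-09-30")

# Label each date with its year and month, e.g. "2017-07"
month <- format(as.Date(date), "%Y-%m")

# Number of accidents per month; "YYYY-MM" labels sort chronologically as text
accidents_per_month <- table(month)
```

In the pipeline above, that would mean creating month inside mutate() and using it in place of week in group_by().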

This should be much more readable now. We’ll improve the theme as well, just because.

if (!require(ggthemr)) devtools::install_github('cttobin/ggthemr')
ggthemr::ggthemr("flat") ## helps with pretty theming

accidents_per_trihour_per_week %>% 
  ggplot() +
  geom_point(aes(x=tri_hour, y=week, size = n), alpha = 0.9) +
  labs(x = "\nTime (in three-hour groups)", y = "Week\n", size = "Accidents count")

Could also do a tile plot

accidents_per_trihour_per_week %>% 
  ggplot() +
  geom_tile(aes(x = tri_hour, y = week, fill = n)) +
  geom_label(aes(x = tri_hour, y = week, label = n), alpha = 0.4, size = 2.5, fontface = "bold") +
  labs(x = "\nTime (in three-hour groups)", y = "Week\n", fill = "Accidents count")

Created on 2021-11-24 by the reprex package (v2.0.1)
