I am trying to group the time and date axes of my scatter plot into epochs/time periods, for example times between 12pm-7:59pm and 9pm-11:59pm. I want to do something similar for the dates.
I am fairly new to R, so I am just looking for suggestions, to be told whether this is even possible, and maybe some alternatives :)
This is my code so far:
library(ggplot2)
accident <- read.csv("accidents.csv", header = TRUE)
ggplot(data = accident) +
  geom_point(mapping = aes(x = Time, y = Date), alpha = 0.5)
Thank you!
CodePudding user response:
Welcome to R! Here is one set of options.
library(tidyverse)
library(lubridate)
First, simulate a dataset
accident <-
  rnorm(n = 1000, mean = 1500000000, sd = 1000000) %>%
  tibble(date_time = .) %>%
  mutate(date_time = as.POSIXct(date_time, origin = "1970-01-01")) %>%
  separate(date_time, into = c("date", "time"), sep = " ", remove = FALSE)
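If your real file stores the date and the time as separate text columns, something along these lines may help to get to the same starting point. This is only a sketch in base R: the example values and the "%Y-%m-%d %H:%M" format are assumptions about what accidents.csv looks like, so adjust them to match your file.

```r
# Hypothetical stand-in for the Date and Time columns of accidents.csv
accident_raw <- data.frame(
  Date = c("2021-11-01", "2021-11-01", "2021-11-02"),
  Time = c("00:15", "13:40", "21:05"),
  stringsAsFactors = FALSE
)

# Combine the two text columns and parse them as one date-time.
# The format string is an assumption; change it to match your file.
accident_raw$date_time <- as.POSIXct(
  paste(accident_raw$Date, accident_raw$Time),
  format = "%Y-%m-%d %H:%M",
  tz = "UTC"
)

# Hour of day (0-23) as a number, ready for binning
accident_raw$hour <- as.integer(format(accident_raw$date_time, "%H"))
accident_raw$hour
```

Rows whose text does not match the format string come back as NA, so it is worth checking for NA values after parsing.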
Original plot:
accident %>%
  ggplot() +
  geom_point(aes(x = time, y = date), alpha = 0.5)
Step 1: Collapse the x axis into a smaller number of groups
accidents_per_trihour <-
  accident %>%
  # hour() returns the hour of day (0-23) directly from the date-time
  mutate(hour = hour(date_time),
         # right = FALSE gives equal bins [0,3), [3,6), ..., [21,24]
         tri_hour = cut(hour, c(0, 3, 6, 9, 12, 15, 18, 21, 24),
                        right = FALSE, include.lowest = TRUE)) %>%
  group_by(date, tri_hour) %>%
  count()
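The groups do not have to be equal in width. For unequal periods like the ones in the question, cut() also accepts custom breaks and labels. A minimal sketch in base R, where the break points and label text are made up for illustration and should be replaced with your own epochs:

```r
# Hypothetical epoch boundaries, in hours of the day; adjust to your own periods
epoch_breaks <- c(0, 6, 12, 20, 24)
epoch_labels <- c("12am-5:59am", "6am-11:59am", "12pm-7:59pm", "8pm-11:59pm")

# Example hours of the day (0-23)
hours <- c(0, 5, 6, 11, 12, 19, 20, 23)

# right = FALSE closes each bin on the left: [0,6), [6,12), [12,20), [20,24]
epoch <- cut(hours, breaks = epoch_breaks, labels = epoch_labels,
             right = FALSE, include.lowest = TRUE)

# Number of observations per epoch
table(epoch)
```

Because the result is a factor whose levels follow the order of the breaks, ggplot keeps the epochs in chronological order on the axis.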
Then scale dot size by number of accidents
accidents_per_trihour %>%
  ggplot() +
  geom_point(aes(x = tri_hour, y = date, size = n), alpha = 0.5) +
  labs(x = "\nTime (in three-hour groups)", y = "Day\n", size = "Accidents count")
Still not great because the y axis is too expansive. So:
Step 2: Collapse the y axis into a smaller number of groups
(For your data you may need to group into months for things to start to look reasonable)
accidents_per_trihour_per_week <-
  accident %>%
  # hour() returns the hour of day (0-23) directly from the date-time
  mutate(hour = hour(date_time),
         # right = FALSE gives equal bins [0,3), [3,6), ..., [21,24]
         tri_hour = cut(hour, c(0, 3, 6, 9, 12, 15, 18, 21, 24),
                        right = FALSE, include.lowest = TRUE)) %>%
  # round each date down to the start of its week, then build a readable label
  mutate(week_start = floor_date(as.Date(date), unit = "weeks"),
         week = format.Date(week_start, "%Y, week %W")) %>%
  group_by(week, tri_hour) %>%
  count()
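If weeks are still too fine for your data, the same idea works for months, as mentioned above. A small sketch in base R with made-up dates; the "%Y-%m" label format is just one possible choice:

```r
# Hypothetical dates; in practice use your own date column
dates <- as.Date(c("2021-01-05", "2021-01-28", "2021-02-03", "2021-03-15"))

# Label each date with its year and month, e.g. "2021-01"
month <- format(dates, "%Y-%m")

# Count rows per month
month_counts <- table(month)
month_counts
```

Putting the year first means the labels sort chronologically even when treated as plain text, which keeps the axis in the right order.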
This should be much more readable now. We'll improve the theme as well, just because.
if (!require(ggthemr)) devtools::install_github('cttobin/ggthemr')
ggthemr::ggthemr("flat") ## helps with pretty theming
accidents_per_trihour_per_week %>%
  ggplot() +
  geom_point(aes(x = tri_hour, y = week, size = n), alpha = 0.9) +
  labs(x = "\nTime (in three-hour groups)", y = "Week\n", size = "Accidents count")
Could also do a tile plot
accidents_per_trihour_per_week %>%
  ggplot() +
  geom_tile(aes(x = tri_hour, y = week, fill = n)) +
  geom_label(aes(x = tri_hour, y = week, label = n), alpha = 0.4, size = 2.5, fontface = "bold") +
  labs(x = "\nTime (in three-hour groups)", y = "Week\n", fill = "Accidents count")
Created on 2021-11-24 by the reprex package (v2.0.1)