I have a dataframe in which each observation represents an object in the real world that I have counted using OpenCV. After some mutations using tidyverse and lubridate, my dataframe looks like this:
> head(odcCountsRoadUsers)
  frameId objectClass dayOfWeek              ymdhms hourOfDay
1    1133         car         1 2021-12-05 13:45:25        13
2    1159         car         1 2021-12-05 13:45:26        13
3    1243         car         1 2021-12-05 13:45:31        13
4    1280      person         1 2021-12-05 13:45:33        13
5    1305         bus         1 2021-12-05 13:45:34        13
6    1812      person         1 2021-12-05 13:46:03        13
I have data for multiple days, including multiple dates that share the same dayOfWeek. What I want to do is create a histogram for each day of the week (7 in total) showing the number of observations for each hour of the day. The result should look similar to this (one histogram per day of the week):
My problems are:
- How do I use ggplot2 so that it uses the POSIXct ymdhms as the x-axis, aggregates the number of observations by hourOfDay for the y-axis (or uses ymdhms directly, if that's possible), and filters by dayOfWeek?
- Since I have multiple days with records, I only want the average for a dayOfWeek and not simply the sum of all observations across different dates. How can I do this efficiently? Does it make sense to create a separate dataframe for that before plotting, or can this be done easily using ggplot2?
A bonus question would be how to separate/group by objectClass, but the two above are the most pressing questions I cannot seem to figure out on my own.
Adding the object class as an argument to count() and group_by() allows for additional stratification:
library(dplyr)
library(lubridate)
library(ggplot2)

aggregated_data2 <-
  data %>%
  mutate(
    weekday = wday(time, label = TRUE),
    hour = hour(time),
    date = date(time)
  ) %>%
  # count observations per object class, date and hour
  count(objectClass, weekday, date, hour) %>%
  # average e.g. over all Mondays for each object class
  group_by(objectClass, weekday, hour) %>%
  summarise(n = mean(n), .groups = "drop")

aggregated_data2 %>%
  ggplot(aes(hour, n, color = objectClass)) +
  geom_line() +
  facet_wrap(~weekday, scales = "free")
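For the two main questions (without objectClass), the same count-then-average pattern might look like the sketch below. It is self-contained only because it builds a synthetic stand-in for odcCountsRoadUsers; the generated timestamps, the three-week span and the names hourly_means and mean_n are assumptions made for illustration, not part of the original data. Bars are drawn with geom_col() because the averages are computed before plotting.

```r
library(dplyr)
library(lubridate)
library(ggplot2)

# Synthetic stand-in for odcCountsRoadUsers (assumption: values are made up;
# only the ymdhms and objectClass columns are needed here).
set.seed(1)
odcCountsRoadUsers <- tibble(
  ymdhms = as.POSIXct("2021-12-05 00:00:00", tz = "UTC") +
    sort(runif(5000, min = 0, max = 21 * 24 * 3600)),
  objectClass = sample(c("car", "person", "bus"), 5000, replace = TRUE)
)

hourly_means <- odcCountsRoadUsers %>%
  mutate(
    weekday = wday(ymdhms, label = TRUE),
    hour = hour(ymdhms),
    date = as_date(ymdhms)
  ) %>%
  # observations per calendar date and hour
  count(weekday, date, hour) %>%
  # average over all dates that fall on the same weekday
  group_by(weekday, hour) %>%
  summarise(mean_n = mean(n), .groups = "drop")

# one panel per day of the week, one bar per hour
p <- ggplot(hourly_means, aes(hour, mean_n)) +
  geom_col() +
  facet_wrap(~weekday) +
  labs(x = "Hour of day", y = "Mean observations per hour")
p
```

One caveat with this approach: count() only returns date/hour combinations that actually occur, so an hour with zero observations on some date is left out of the mean rather than counted as 0. If that matters, tidyr::complete() could be used to fill in the missing combinations before averaging.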