I play an online card game with friends and keep a lot of statistics on the games. I would like to evaluate whether we play better or worse as it gets later in the evening. My actual dataset is much larger, but this example illustrates the problem. I have a data frame with two variables: a time variable (in hms format) and a logical vector showing whether the game played at that time was won (TRUE) or lost (FALSE).
library(hms)
win<-(c(TRUE,TRUE,TRUE,FALSE,FALSE,TRUE,TRUE,FALSE,TRUE,TRUE,TRUE,FALSE,TRUE,FALSE,FALSE,TRUE,TRUE,TRUE))
time<-as_hms(c("20:30:51","20:36:11","20:41:31","20:51:53","20:57:40","21:02:00","21:06:26","21:08:57","21:13:03","21:20:30","21:25:15","21:29:31","21:41:55","21:45:38","21:50:02","21:53:22","21:57:30","22:02:08"))
myframe <- data.frame(win, time)
Now I want to group the times, for example by hour, and calculate the win ratio within each time slot. For example, five games were played from 20-21h, of which 3 were won and 2 were lost. Twelve games were played from 21-22h, of which 8 were won and 4 were lost. Just one game was played after 22h, and it was won.
I would like a plot with the time slots on the x-axis (20-21h, 21-22h, 22-23h) and the win ratio on the y-axis, that is 0.60, 0.67, 1.0. I can calculate this ratio for the whole vector with sum(win==TRUE)/length(win). But how do I do this for defined time slots and then make such a plot?
I also want to be able to do this with custom, shorter time slots, such as 10 or 15 minutes, and to plot data other than just the win ratio against the time slots.
CodePudding user response:
This is not the best option, since the time slots are hard-coded, but it may get you started.
library(tidyverse)
myframe %>%
  # Extract the hour (everything before the first colon) and label the slot
  mutate(hour = sub("\\:.*", "", time),
         timeslot = ifelse(hour == "20", "20-21h",
                           ifelse(hour == "21", "21-22h", "22-23h"))) %>%
  group_by(timeslot) %>%
  # Share of games won within each slot
  mutate(win_ratio = sum(win == TRUE)/length(win)) %>%
  ggplot(aes(x = timeslot, y = win_ratio)) +
  geom_point() +
  labs(y = "Win ratio") +
  theme_bw()
CodePudding user response:
Using floor_hms to round the times down to the chosen unit:
library(tidyverse)
library(lubridateExtras)
library(scales)
library(hms)
# Data
win <- c(TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE,
         TRUE, FALSE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE)
time <- as_hms(c("20:30:51", "20:36:11", "20:41:31", "20:51:53", "20:57:40",
                 "21:02:00", "21:06:26", "21:08:57", "21:13:03", "21:20:30",
                 "21:25:15", "21:29:31", "21:41:55", "21:45:38", "21:50:02",
                 "21:53:22", "21:57:30", "22:02:08"))
myframe <- data.frame(win, time)
# Plot
myframe |>
  # Round each time down to the start of its hour
  mutate(time = floor_hms(time, unit = "hour")) |>
  group_by(time) |>
  # One row per slot: share of games won
  summarise(win_rate = sum(win) / n()) |>
  ggplot(aes(time, win_rate)) +
  geom_col() +
  geom_label(aes(label = percent(win_rate))) +
  scale_x_time() +
  scale_y_continuous(labels = label_percent())
Created on 2022-06-09 by the reprex package (v2.0.1)
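For the custom slot widths mentioned in the question (10 or 15 minutes), here is a minimal base R sketch that avoids extra packages. The helper name slot_stats and its width_min argument are made up for illustration and do not come from any package. It assumes the times are "HH:MM:SS" strings, or something that as.character() turns into that form, which should include hms values.

```r
# Example data from the question, with times kept as plain strings
win <- c(TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE,
         TRUE, FALSE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE)
time <- c("20:30:51", "20:36:11", "20:41:31", "20:51:53", "20:57:40",
          "21:02:00", "21:06:26", "21:08:57", "21:13:03", "21:20:30",
          "21:25:15", "21:29:31", "21:41:55", "21:45:38", "21:50:02",
          "21:53:22", "21:57:30", "22:02:08")

# Hypothetical helper: bin times into slots of `width_min` minutes and
# return the number of games and the win rate per slot.
slot_stats <- function(time, win, width_min = 60) {
  # Convert "HH:MM:SS" to seconds since midnight
  parts <- do.call(rbind, strsplit(as.character(time), ":", fixed = TRUE))
  secs <- as.numeric(parts[, 1]) * 3600 +
    as.numeric(parts[, 2]) * 60 +
    as.numeric(parts[, 3])

  # Round down to the start of the slot and label it as "HH:MM"
  width <- width_min * 60
  start <- floor(secs / width) * width
  label <- sprintf("%02d:%02d", start %/% 3600, (start %% 3600) %/% 60)

  # tapply() returns groups in sorted label order, matching sort(unique(label))
  data.frame(
    slot = sort(unique(label)),
    games = as.vector(tapply(win, label, length)),
    win_rate = as.vector(tapply(win, label, mean)),
    stringsAsFactors = FALSE
  )
}

hourly <- slot_stats(time, win, width_min = 60)
quarter <- slot_stats(time, win, width_min = 15)
```

With the example data, hourly has the slots 20:00, 21:00 and 22:00 with 5, 12 and 1 games and win rates of 0.60, 0.67 and 1.0, matching the numbers in the question. Any column can then be plotted, for example with barplot(quarter$win_rate, names.arg = quarter$slot), or with barplot(quarter$games, names.arg = quarter$slot) to show how many games each rate is based on. Slots with no games are simply absent from the result.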