I have a question on converting time in R.
- First, I need to convert times stored as character strings into numeric values. Start times are stored in one column and finish times in another. For example: work start time 09:00 and work finish time 17:00.
- I then want to be able to calculate the time in between these times (i.e. the hours) for multiple rows of data by using a function. I.e. how many hours does someone work on an average day?
- Finally, I want to compare early start and finish times with late start and finish times by assigning a category to each time. For example, someone who started work before 10:00 would be classified as "early starter" and someone who started after 10:00 as "late starter" in one column. Likewise, someone who finished work before 17:00 would be classified as "early finisher" and someone who finished after 17:00 as "late finisher" in another column. Is there a way for R to recognise times in this way when you don't have a date to assign them to?
All the advice I have read so far seems to be geared towards a particular time within a date, e.g. DD/MM/YY HH:MM. I am only concerned with a time of day.
Thanks in advance.
CodePudding user response:
You can try with lubridate:
library(lubridate)
set.seed(4)
df <- data.frame(start = paste0(sample(7:11, 10, replace = TRUE), ".00"),
                 finish = paste0(sample(16:19, 10, replace = TRUE), ".00"))
df$duration <- hm(df$finish)-hm(df$start)
df$start_cat <- ifelse(hm(df$start)<hm("10.00"),"early_starter","late_starter")
df$finish_cat <- ifelse(hm(df$finish)<hm("17.00"),"early_finisher","late_finisher")
output:
   start finish  duration     start_cat     finish_cat
1   9.00  19.00 10H 0M 0S early_starter  late_finisher
2   9.00  18.00  9H 0M 0S early_starter  late_finisher
3   9.00  18.00  9H 0M 0S early_starter  late_finisher
4  10.00  18.00  8H 0M 0S  late_starter  late_finisher
5   9.00  18.00  9H 0M 0S early_starter  late_finisher
6  11.00  16.00  5H 0M 0S  late_starter early_finisher
7   8.00  19.00 11H 0M 0S early_starter  late_finisher
8   9.00  19.00 10H 0M 0S early_starter  late_finisher
9   8.00  16.00  8H 0M 0S early_starter early_finisher
10  7.00  16.00  9H 0M 0S early_starter early_finisher
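The duration column above is a lubridate Period, not a plain number, so it cannot be averaged directly. If you want the average hours per day that the question asks about, one option is to go through seconds. This is only a minimal sketch: the three-row data frame and the names `hours` and `mean_hours` are made up for illustration.

```r
library(lubridate)

# Hypothetical example data, in the same "H.MM" style as above
df <- data.frame(start  = c("9.00", "10.00", "8.00"),
                 finish = c("19.00", "18.00", "16.00"))

# period_to_seconds() converts a Period to a plain number of seconds,
# so the difference divided by 3600 is the shift length in hours
df$hours <- (period_to_seconds(hm(df$finish)) -
             period_to_seconds(hm(df$start))) / 3600

# Average hours worked across all rows
mean_hours <- mean(df$hours)
```

Here `df$hours` is an ordinary numeric vector, so `mean()`, `summary()` and grouping by `start_cat` or `finish_cat` should all work as usual.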
CodePudding user response:
The easiest way to go about this is to use the hms package.
Suppose your data is something like this:
df <- data.frame(employee = LETTERS[1:4],
                 starts = c("08:00", "09:00", "10:00", "11:00"),
                 finishes = c("16:00", "17:00", "17:30", "18:00"))
df
#>   employee starts finishes
#> 1        A  08:00    16:00
#> 2        B  09:00    17:00
#> 3        C  10:00    17:30
#> 4        D  11:00    18:00
Then you can do:
library(hms)
library(dplyr)
df %>% mutate(early_start = as_hms(paste0(starts, ":00")) < as_hms("10:00:00"),
              late_finish = as_hms(paste0(finishes, ":00")) > as_hms("17:00:00"))
#>   employee starts finishes early_start late_finish
#> 1        A  08:00    16:00        TRUE       FALSE
#> 2        B  09:00    17:00        TRUE       FALSE
#> 3        C  10:00    17:30       FALSE        TRUE
#> 4        D  11:00    18:00       FALSE        TRUE
Created on 2021-10-16 by the reprex package (v2.0.0)
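The question also asks for the hours between start and finish. Since an hms value is stored as a number of seconds since midnight, a rough sketch of that step could look like the following. The column names `hours_worked` and `start_cat` and the variable `avg_hours` are my own choices, not anything required by hms.

```r
library(hms)

df <- data.frame(employee = LETTERS[1:4],
                 starts   = c("08:00", "09:00", "10:00", "11:00"),
                 finishes = c("16:00", "17:00", "17:30", "18:00"))

# Append ":00" so the strings have the HH:MM:SS form used with as_hms() above
start_hms  <- as_hms(paste0(df$starts, ":00"))
finish_hms <- as_hms(paste0(df$finishes, ":00"))

# as.numeric() on an hms value gives seconds since midnight
df$hours_worked <- (as.numeric(finish_hms) - as.numeric(start_hms)) / 3600

# Text labels instead of TRUE/FALSE, if a category column is preferred
df$start_cat <- ifelse(start_hms < as_hms("10:00:00"),
                       "early starter", "late starter")

# Average hours worked per day across employees
avg_hours <- mean(df$hours_worked)
```

Note that a start time of exactly 10:00 falls into "late starter" with this comparison; use `<=` if you want it counted as early.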