Home > Mobile >  Sum total hours by date in R
Sum total hours by date in R

Time:10-07

I would like to know how to sum the total hours by day using a condiction that only a time assigned with an X in the third column should be sum. I have this data.frame:

Date Time Run
17/04/12 00:10:00 x
17/04/12 00:19:00 x
17/04/12 00:25:00 x
17/04/12 00:29:00 x
17/04/12 00:25:00
17/04/12 00:29:00 x
17/04/12 00:30:00 x
18/04/12 00:10:00
18/04/12 00:14:00 x
18/04/12 00:20:00 x

Result:

Date Total_Time
17/04/12 00:20:00
18/04/12 00:06:00

I tried using lubridate and tidyverse packages, but I have had no success.

Thanks

CodePudding user response:

I am not sure if this is what you want to do...

library(lubridate)
library(dplyr)

df %>% 
  mutate(
    Date = dmy(Date),
    Time = hms(Time),
    # minute(Time)
  ) %>% 
  filter(Run == "x") %>% 
  group_by(Date) %>% 
  summarise(Total_Time = seconds_to_period(sum(period_to_seconds(Time))))

# A tibble: 2 x 2
  Date       Total_Time
  <date>     <Period>  
1 2012-04-17 1H 39M 0S 
2 2012-04-18 34M 0S 

CodePudding user response:

First, we can discard all time-points not marked as X.

Then, assuming there are leaps between two consecutive runs maked as X, we can sum up the durations of every leap per day:

library(tidyverse)
library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union

data <- tribble(
  ~Date ,   ~Time,  ~Run,
  "17/04/12",   "00:10:00" ,    NA,
  "17/04/12",   "00:20:00" ,    "x",
  "17/04/12" ,  "00:25:00" ,    "x",
  "17/04/12",   "00:29:00" ,    NA,
  "17/04/12",   "00:25:00" ,    "x",
  "17/04/12",   "00:29:00" ,    "x",
  "18/04/12",   "00:10:00" ,    NA,
  "18/04/12",   "00:14:00" ,    "x",
  "18/04/12" ,  "00:20:00" ,    "x"
)
data %>%
  mutate(date = Date) %>%
  filter(! is.na(Run)) %>%
  # make year to 4 digits
  unite(Date, Time, col = "datetime", sep = "00 - ") %>%
  mutate(
    date,
    leap = as.integer(row_number() / 2 - 0.1),
    datetime = datetime %>% parse_datetime(format = "%d/%M/%Y - %H:%M:%S"),
  ) %>%
  group_by(date, leap) %>%
  mutate(duration = abs(lag(datetime) - datetime)) %>%
  group_by(date) %>%
  summarise(duration = sum(duration, na.rm = TRUE))
#> # A tibble: 2 x 2
#>   date     duration
#>   <chr>    <drtn>  
#> 1 17/04/12 9 mins  
#> 2 18/04/12 6 mins

Created on 2021-10-07 by the reprex package (v2.0.1)

  • Related