I have a tibble and want to compute monthly availability (A), defined as
A = uptime / (uptime + downtime),
where monthly downtime is the sum of end - start over the outages in that month, and uptime is the total time in the month minus the downtime. How can I compute monthly availability for the year 2019?
This is the sample data:
structure(list(start = structure(c(1550048400, 1558008000, 1558703040,
1561032000, 1560945660, 1563451200), tzone = "UTC", class = c("POSIXct",
"POSIXt")), end = structure(c(1550143989, 1558008000, 1558956840,
1561032000, 1560945660, 1563451200), tzone = "GMT", class = c("POSIXct",
"POSIXt"))), row.names = c(NA, -6L), class = c("tbl_df", "tbl",
"data.frame"))
CodePudding user response:
First, you have inconsistent "tzone" attributes: one is "UTC" and the other is "GMT". It's minor (and slightly noisy), so I'll preempt the noise by aligning them (this does not change the results):
attr(dat$end, "tzone") <- "UTC"
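To see why this does not change the results, here is a minimal sketch (the names x and y are illustrative, not from the data above): the "tzone" attribute only controls how a POSIXct value is printed, while the underlying number of seconds since the epoch stays the same.

```r
# One of the end times from the sample data, labelled as GMT
x <- as.POSIXct(1550143989, origin = "1970-01-01", tz = "GMT")

# Copy it and relabel the time zone as UTC
y <- x
attr(y, "tzone") <- "UTC"

# The stored instant (seconds since 1970-01-01) is unchanged
as.numeric(x) == as.numeric(y)
```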
A helper function:
fun <- function(mon1, mon2, x = dat) {
  # include every outage that overlaps the window [mon1, mon2), even one spanning the whole month ...
  tmp <- x[with(x, start < mon2 & end > mon1), ] |>
    # ... and if start-to-end straddles a month begin/end, truncate it to the window
    transform(
      start = pmax(start, mon1),
      end = pmin(end, mon2)
    )
  data.frame(start = mon1, end = mon2) |>
    # downtime and uptime are both expressed in hours
    transform(downtime = c(sum(with(tmp, as.numeric(end - start, units = "hours"))), 0)[1]) |>
    transform(uptime = as.numeric(mon2 - mon1, units = "hours") - downtime) |>
    transform(A = uptime / (uptime + downtime))
}
And the work in base R:
# 13 month boundaries delimit the 12 months of 2019
months <- seq(as.POSIXct("2019-01-01 00:00:00", tz="UTC"), by="1 month", length.out=13)
months
# [1] "2019-01-01 UTC" "2019-02-01 UTC" "2019-03-01 UTC" "2019-04-01 UTC" "2019-05-01 UTC" "2019-06-01 UTC" "2019-07-01 UTC"
# [8] "2019-08-01 UTC" "2019-09-01 UTC" "2019-10-01 UTC" "2019-11-01 UTC" "2019-12-01 UTC" "2020-01-01 UTC"
do.call(rbind, Map(fun, months[-13], months[-1]))
# start end downtime uptime A
# 1 2019-01-01 2019-02-01 0.0000 744.0000 1.0000000
# 2 2019-02-01 2019-03-01 26.5525 645.4475 0.9604874
# 3 2019-03-01 2019-04-01 0.0000 744.0000 1.0000000
# 4 2019-04-01 2019-05-01 0.0000 720.0000 1.0000000
# 5 2019-05-01 2019-06-01 70.5000 673.5000 0.9052419
# 6 2019-06-01 2019-07-01 0.0000 720.0000 1.0000000
# 7 2019-07-01 2019-08-01 0.0000 744.0000 1.0000000
# 8 2019-08-01 2019-09-01 0.0000 744.0000 1.0000000
# 9 2019-09-01 2019-10-01 0.0000 720.0000 1.0000000
# 10 2019-10-01 2019-11-01 0.0000 744.0000 1.0000000
# 11 2019-11-01 2019-12-01 0.0000 720.0000 1.0000000
# 12 2019-12-01 2020-01-01 0.0000 744.0000 1.0000000
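As a sanity check on the February row, the same numbers can be reproduced by hand from the first outage in the sample data. This is only a sketch; outage_start, outage_end, and total are illustrative names, not part of the code above.

```r
# First outage in the sample data (epoch seconds, UTC)
outage_start <- as.POSIXct(1550048400, origin = "1970-01-01", tz = "UTC")
outage_end   <- as.POSIXct(1550143989, origin = "1970-01-01", tz = "UTC")

# Downtime in hours: 95589 seconds / 3600 = 26.5525
downtime <- as.numeric(difftime(outage_end, outage_start, units = "hours"))

# Total hours in February 2019: 28 days * 24 = 672
total <- as.numeric(difftime(as.POSIXct("2019-03-01", tz = "UTC"),
                             as.POSIXct("2019-02-01", tz = "UTC"),
                             units = "hours"))

uptime <- total - downtime          # 645.4475
A <- uptime / (uptime + downtime)   # about 0.9605

c(downtime = downtime, uptime = uptime, A = A)
```

Because uptime + downtime is by construction the total time in the month, A is the same as uptime / total.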
CodePudding user response:
If you are trying to calculate the value of 'A' for each month, then the process would be:
- sum up all the downtime in each month
- subtract that from the total time in the month to get the uptime
- divide the uptime by the total time in the month
This is possible using the lubridate package:
library(lubridate)
library(dplyr)
data <- data %>%
  # each outage is attributed to the month in which it ends
  mutate(downtime = end - start,
         month = format(end, "%Y-%m %b"),
         month_time = ceiling_date(end, unit = "months") - floor_date(end, unit = "months")) %>%
  group_by(month) %>%
  summarise(downtime = sum(downtime),
            month_time = month_time[1]) %>%
  mutate(uptime = month_time - downtime,
         A = as.numeric(uptime) / as.numeric(uptime + downtime))
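Note that the pipeline above assigns each outage to the month in which it ends, so an outage that straddles a month boundary would be counted entirely in the later month. Below is a minimal base R sketch of one way to split such an outage across months. It assumes a hypothetical outage from 2019-01-31 12:00 to 2019-02-01 06:00 (not in the sample data), and outages, bounds, and overlap_hours are illustrative names.

```r
# Hypothetical outage that starts in January and ends in February 2019
outages <- data.frame(
  start = as.POSIXct("2019-01-31 12:00:00", tz = "UTC"),
  end   = as.POSIXct("2019-02-01 06:00:00", tz = "UTC")
)

# 13 month boundaries delimit the 12 months of 2019
bounds <- seq(as.POSIXct("2019-01-01", tz = "UTC"), by = "1 month", length.out = 13)

# Hours of overlap between every outage in x and the window [lo, hi)
overlap_hours <- function(lo, hi, x) {
  s <- pmax(x$start, lo)   # clip the start to the window
  e <- pmin(x$end, hi)     # clip the end to the window
  # outages outside the window give a negative difference, floored at 0
  sum(pmax(as.numeric(difftime(e, s, units = "hours")), 0))
}

downtime <- vapply(
  seq_len(12),
  function(i) overlap_hours(bounds[i], bounds[i + 1], outages),
  numeric(1)
)

# Total hours in each month, then availability
total <- as.numeric(difftime(bounds[-1], bounds[-13], units = "hours"))
result <- data.frame(
  month    = format(bounds[-13], "%Y-%m"),
  downtime = downtime,
  uptime   = total - downtime,
  A        = (total - downtime) / total
)
result
```

With this hypothetical outage, 12 hours of downtime fall in January and 6 hours in February, rather than all 18 hours landing in February.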