Count bizdays that intersect between lubridate intervals in R

Time:10-06

I have a dataset with ~330 000 rows. Each observation represents a period in which an individual received a welfare benefit called "care allowance". The benefit is meant to replace income when the recipient has to be absent from work to care for their child full-time due to serious illness, or to accompany them to a specialist healthcare institution.

There was a change in legislation regarding the welfare benefit in 2017, and one of my research questions concerns changes in the size and composition of the recipient population. My dataset contains information on each case of benefit reception from Jan 1st 2016 to Dec 31st 2021.

I want to show how the number of work days compensated by the care allowance scheme has developed over time. A single period of care allowance reception can span several years. For each reception period, I want to count the number of business days (i.e. Monday through Friday) between its start date and end date that fall within each of the years from 2016 to 2021.

So far I am only able to get the count of calendar days for each year. I would appreciate suggestions on how to modify my code so that df$bdays == df$days and the vars(days16:days21) count the number of business days instead.

Update

@Marcus' suggestion works well enough on a small dataset, but takes an unwieldy amount of time to execute on my full dataset (over an hour and a half). I've come up with a solution using purrr::map2_dbl().
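For the runtime problem, a vectorised count avoids iterating over rows at all. Below is a minimal base R sketch, assuming public holidays can be ignored and that both the start and end date count. weekdays_before(), count_weekdays() and weekdays_in_year() are illustrative names of my own, not functions from bizdays or lubridate, and the two periods are made up.

```r
# Number of Monday-Friday dates strictly before day index m,
# where m counts days from Monday 1970-01-05 (so m = 0 is a Monday).
weekdays_before <- function(m) {
  5 * (m %/% 7) + pmin(m %% 7, 5)
}

# Monday-Friday dates in [from, to], both ends included; vectorised over rows.
# Assumption: public holidays are NOT excluded.
count_weekdays <- function(from, to) {
  a <- as.numeric(from) - 4   # 1970-01-05 has day number 4
  b <- as.numeric(to) - 4
  ifelse(b < a, 0, weekdays_before(b + 1) - weekdays_before(a))
}

# Weekdays of each reception period that fall inside one calendar year.
weekdays_in_year <- function(start_date, end_date, year) {
  # Clip every period to the year; periods outside it end up with from > to
  from <- pmax(start_date, as.Date(paste0(year, "-01-01")))
  to   <- pmin(end_date,   as.Date(paste0(year, "-12-31")))
  count_weekdays(from, to)
}

# Two made-up periods: one spanning New Year, one covering only a weekend
start_date <- as.Date(c("2016-12-26", "2018-03-03"))
end_date   <- as.Date(c("2017-01-06", "2018-03-04"))

weekdays_in_year(start_date, end_date, 2016)  # 5 0
weekdays_in_year(start_date, end_date, 2017)  # 5 0
weekdays_in_year(start_date, end_date, 2018)  # 0 0
```

On the real data this would be called once per year, e.g. df$bdays16 <- weekdays_in_year(df$start_date, df$end_date, 2016). Since it only does arithmetic on whole columns, it should scale to a few hundred thousand rows without trouble. If holidays matter, they would have to be subtracted separately or handled with a bizdays calendar.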

Original code:

library(bizdays)
library(lubridate)
library(dplyr)


# Simulated example data: 1000 periods with random start and end dates
id <- sort(sample(1:100, 1000, replace = TRUE))
start_date <- sample(seq(ymd("2016-01-01"), ymd("2021-12-30"), by = "day"), 1000)
end_date <- sample(seq(ymd("2016-01-01"), ymd("2021-12-31"), by = "day"), 1000)

df <- data.frame(id, start_date, end_date) %>%
  filter(end_date > start_date) %>%
  mutate(interval = interval(start = start_date, end = end_date))



df <- df %>%
  mutate(days16 = as.period(intersect(interval, interval(ymd("2016-01-01"), ymd("2016-12-31"))))%/           