Tally if observations fall in date windows

Time:12-14

I have a data frame that represents policies with start and end dates. I'm trying to tally the count of policies that are active each month.

library(tidyverse)

ayear <- 2021
amonth <- 10
months <- 12

df <- tibble(
  pol = c(1, 2, 3, 4)
  , bdate = c('2021-02-23', '2019-12-03', '2020-08-11', '2020-12-14')
  , edate = c('2022-02-23', '2020-12-03', '2021-08-11', '2021-06-14')
  )

These four policies each have a begin date (bdate) and an end date (edate). Starting from October (amonth) 2021 (ayear) and going back 12 months (months), I'm trying to count how many of the 4 policies were active at some point in each month.

The data frame I'm trying to generate would have 12 rows and three columns: month, year, and active_pol_count.

CodePudding user response:

library(tidyverse)
library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union

df <- tibble(
  pol = c(1, 2, 3, 4),
  bdate = c("2021-02-23", "2019-12-03", "2020-08-11", "2020-12-14"),
  edate = c("2022-02-23", "2020-12-03", "2021-08-11", "2021-06-14")
)

# transform start and end date to interval
df <- mutate(df, interval = interval(bdate, edate))

# for the first day of each month from 2020-10 through 2021-09,
# count the policies whose interval contains that day
seq(as.Date("2020-10-01"), as.Date("2021-09-01"), by = "months") %>%
  tibble(date = .) %>%
  mutate(
    year = year(date),
    month = month(date),
    active_pol_count = date %>% map_dbl(~ .x %within% df$interval %>% sum()),
  )
#> # A tibble: 12 x 4
#>    date        year month active_pol_count
#>    <date>     <dbl> <dbl>            <dbl>
#>  1 2020-10-01  2020    10                2
#>  2 2020-11-01  2020    11                2
#>  3 2020-12-01  2020    12                2
#>  4 2021-01-01  2021     1                2
#>  5 2021-02-01  2021     2                2
#>  6 2021-03-01  2021     3                3
#>  7 2021-04-01  2021     4                3
#>  8 2021-05-01  2021     5                3
#>  9 2021-06-01  2021     6                3
#> 10 2021-07-01  2021     7                2
#> 11 2021-08-01  2021     8                2
#> 12 2021-09-01  2021     9                1

Created on 2021-12-13 by the reprex package (v2.0.1)
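
Note that the code above tests only whether the first day of each month falls inside a policy's interval, so a policy that begins mid-month (policy 1 in February 2021, policy 4 in December 2020) is not counted for that month. If "active at some point in the month" is the goal, one possible variation is to test for overlap between the policy and the whole month. The following is a minimal sketch, not the answerer's method. It assumes that edate is inclusive, that the window is the same 12 months as above (2020-10 through 2021-09), and it renames the question's `months` variable to `n_months` (my own choice) so it does not read like lubridate's `months()`.

```r
library(tidyverse)
library(lubridate)

ayear <- 2021
amonth <- 10
n_months <- 12  # called `months` in the question; renamed here for clarity

df <- tibble(
  pol = c(1, 2, 3, 4),
  bdate = as.Date(c("2021-02-23", "2019-12-03", "2020-08-11", "2020-12-14")),
  edate = as.Date(c("2022-02-23", "2020-12-03", "2021-08-11", "2021-06-14"))
)

# Assumption: use the 12 months before the anchor month, matching the
# window in the answer above (2020-10 through 2021-09).
anchor <- make_date(ayear, amonth, 1)
month_starts <- seq(anchor - months(n_months), by = "month", length.out = n_months)

overlap_counts <- tibble(month_start = month_starts) %>%
  mutate(
    next_month_start = month_start + months(1),
    year = year(month_start),
    month = month(month_start),
    # A policy overlaps a month if it begins before the next month starts
    # and ends on or after the first day of the month (edate assumed inclusive).
    active_pol_count = map2_int(
      month_start, next_month_start,
      ~ sum(df$bdate < .y & df$edate >= .x)
    )
  ) %>%
  select(month, year, active_pol_count)

overlap_counts
```

With this overlap test, December 2020 and February 2021 each count 3 policies instead of 2; the other months match the output above. To include October 2021 itself in the window, shift the start of `month_starts` forward by one month.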
