Home > Mobile >  How to count the number of observation that fall in quarter based on start and end date?
How to count the number of observation that fall in quarter based on start and end date?

Time:10-27

   patient_id dt_diag_init enroll    
      <int64> <date>       <date>    
 1      10401 2018-01-04   2020-09-30
 2      60701 2019-05-31   2019-09-30
 3     343702 2018-12-05   2020-09-30
 4     472202 2019-12-30   2020-09-30
 5     489502 2019-09-17   2019-11-30
 6     557401 2019-10-15   2020-09-30
 7     857901 2018-01-02   2020-09-30
 8     874201 2018-01-01   2020-09-30
 9    1309102 2019-03-11   2020-09-30
10    1317601 2018-08-14   2020-09-30

I am trying to tally the number of patients per quarter based on the date range (dt_diag_init and enroll). The patient_id 10401 has starts from 2018-01-04 to 2020-09-30 so that patient would be counted in Q1'18, Q2'18....to Q3'20 since the date range overlap. I am trying to get an output table with just the Quarter and the count of patients per quarter.

Sample output table

Qtr    year total   
   <chr> <dbl> <dbl>
 1 Q1'18  2018 485 
 2 Q2'18  2018 516 
 3 Q3'18  2018 560

This is what I tried to create the table with quarter column before doing the tally but I am getting errors:

df_1 <- df %>%
  mutate(quarter = map2(
    as.numeric(dt_diag_init),
    as.numeric(enroll),
    ~ format(seq(.x, .y, by="quarter"), "Q%q'%y")
  ))

Sample date:

df <- structure(list(patient_id = structure(c(5.13877678239481e-320, 
2.99902787682095e-319, 1.69811350606928e-318, 2.33298786097528e-318, 
2.41846121770582e-318, 2.75392685057557e-318, 4.23859411632851e-318, 
4.31912681660064e-318, 6.46782325102068e-318, 6.50981389026072e-318
), class = "integer64"), dt_diag_init = structure(c(17535, 18047, 
17870, 18260, 18156, 18184, 17533, 17532, 17966, 17757), class = "Date"), 
    enroll = structure(c(18535, 18169, 18535, 18535, 18230, 18535, 
    18535, 18535, 18535, 18535), class = "Date")), row.names = c(NA, 
-10L), class = c("tbl_df", "tbl", "data.frame"))

CodePudding user response:

Perhaps this helps

library(dplyr)
library(zoo)
library(tidyr)
library(purrr)
library(lubridate)
df %>%  
   mutate(quarter = map2(dt_diag_init, enroll, 
    ~ as.yearqtr(seq(.x, .y, by = 'quarter')))) %>%
   unnest(quarter) %>%
   mutate(year = year(quarter)) %>% 
   count(quarter, year, name = 'total')

-output

# A tibble: 11 × 3
   quarter    year total
   <yearqtr> <dbl> <int>
 1 2018 Q1    2018     3
 2 2018 Q2    2018     3
 3 2018 Q3    2018     4
 4 2018 Q4    2018     5
 5 2019 Q1    2019     6
 6 2019 Q2    2019     7
 7 2019 Q3    2019     8
 8 2019 Q4    2019     8
 9 2020 Q1    2020     8
10 2020 Q2    2020     8
11 2020 Q3    2020     8
  • Related