patient_id dt_diag_init enroll
<int64> <date> <date>
1 10401 2018-01-04 2020-09-30
2 60701 2019-05-31 2019-09-30
3 343702 2018-12-05 2020-09-30
4 472202 2019-12-30 2020-09-30
5 489502 2019-09-17 2019-11-30
6 557401 2019-10-15 2020-09-30
7 857901 2018-01-02 2020-09-30
8 874201 2018-01-01 2020-09-30
9 1309102 2019-03-11 2020-09-30
10 1317601 2018-08-14 2020-09-30
I am trying to tally the number of patients per quarter based on the date range (dt_diag_init and enroll). The patient_id 10401 has starts from 2018-01-04 to 2020-09-30 so that patient would be counted in Q1'18, Q2'18....to Q3'20 since the date range overlap. I am trying to get an output table with just the Quarter and the count of patients per quarter.
Sample output table
Qtr year total
<chr> <dbl> <dbl>
1 Q1'18 2018 485
2 Q2'18 2018 516
3 Q3'18 2018 560
This is what I tried to create the table with quarter column before doing the tally but I am getting errors:
df_1 <- df %>%
mutate(quarter = map2(
as.numeric(dt_diag_init),
as.numeric(enroll),
~ format(seq(.x, .y, by="quarter"), "Q%q'%y")
))
Sample date:
df <- structure(list(patient_id = structure(c(5.13877678239481e-320,
2.99902787682095e-319, 1.69811350606928e-318, 2.33298786097528e-318,
2.41846121770582e-318, 2.75392685057557e-318, 4.23859411632851e-318,
4.31912681660064e-318, 6.46782325102068e-318, 6.50981389026072e-318
), class = "integer64"), dt_diag_init = structure(c(17535, 18047,
17870, 18260, 18156, 18184, 17533, 17532, 17966, 17757), class = "Date"),
enroll = structure(c(18535, 18169, 18535, 18535, 18230, 18535,
18535, 18535, 18535, 18535), class = "Date")), row.names = c(NA,
-10L), class = c("tbl_df", "tbl", "data.frame"))
CodePudding user response:
Perhaps this helps
library(dplyr)
library(zoo)
library(tidyr)
library(purrr)
library(lubridate)
df %>%
mutate(quarter = map2(dt_diag_init, enroll,
~ as.yearqtr(seq(.x, .y, by = 'quarter')))) %>%
unnest(quarter) %>%
mutate(year = year(quarter)) %>%
count(quarter, year, name = 'total')
-output
# A tibble: 11 × 3
quarter year total
<yearqtr> <dbl> <int>
1 2018 Q1 2018 3
2 2018 Q2 2018 3
3 2018 Q3 2018 4
4 2018 Q4 2018 5
5 2019 Q1 2019 6
6 2019 Q2 2019 7
7 2019 Q3 2019 8
8 2019 Q4 2019 8
9 2020 Q1 2020 8
10 2020 Q2 2020 8
11 2020 Q3 2020 8