Home > front end >  Monthly table filtered by year of a hospital in R
Monthly table filtered by year of a hospital in R

Time:05-22

I work in a hospital and I'm having a problem. I was asked to report the number of patients who were in the hospital (not the number of admissions) in each month filtered by the year in R. How could I do this? Here is an example data frame:

id <-  c("1","2","3","4","5")
hospitalization_date <- c(as.Date("2010-01-01"), as.Date("2011-03-01"),as.Date("2010-04-01"),as.Date("2010-02-01"),as.Date("2011-06-01"))
discharged_date <- c(as.Date("2010-05-11"), as.Date("2011-08-04"),as.Date("2010-06-13"),as.Date("2010-08-02"),as.Date("2011-06-23"))
cid <- c("F29","F33","F71","F29","F09")
gender <- c("F","M","M","M","F")
df <- data.frame(id,hospitalization_date,discharged_date,cid,gender)

I would like the result to be like this, filtering by the year 2010:

   monthly number_of_patients
1  jan                  1
2  feb                  2
3  mar                  2
4  apr                  3
5  may                  3
6  jun                  2
7  jul                  1
8  aug                  1
9  sep                  0
10 oct                  0
11 nov                  0
12 dec                  0

notice that the patient id=1 is in the months jan, feb, mar, apr and may, the patient id=3 is in the months apr, may and jun and the patient id=4 is in the months feb, mar, apr, may , Jun, Jul and Aug all from the year 2010.

and so, filtered by the year 2011:

   monthly number_of_patients
1  jan                  0
2  feb                  0
3  mar                  1
4  apr                  1
5  may                  1
6  jun                  2
7  jul                  1
8  aug                  1
9  sep                  0
10 oct                  0
11 nov                  0
12 dec                  0

Help me please.

CodePudding user response:

Perhaps this helps

library(dplyr)
library(lubridate)
library(tidyr)
library(purrr)
out <- df %>% 
  mutate(across(ends_with('_date'), ymd)) %>%
  transmute(id, 
  dates = map2(hospitalization_date, discharged_date,
      ~ seq(.x, .y, by = 'month'))) %>%
   unnest(dates) %>%
   mutate(year = year(dates), monthly = format(dates, '%b')) %>% 
   count(year, monthly) %>%
   group_by(year) %>% 
   complete(monthly = month.abb, fill = list(n = 0)) %>% 
   arrange(match(monthly, month.abb), .by_group = TRUE) %>%
   ungroup

-output

> as.data.frame(out)
   year monthly n
1  2010     Jan 1
2  2010     Feb 2
3  2010     Mar 2
4  2010     Apr 3
5  2010     May 3
6  2010     Jun 2
7  2010     Jul 1
8  2010     Aug 1
9  2010     Sep 0
10 2010     Oct 0
11 2010     Nov 0
12 2010     Dec 0
13 2011     Jan 0
14 2011     Feb 0
15 2011     Mar 1
16 2011     Apr 1
17 2011     May 1
18 2011     Jun 2
19 2011     Jul 1
20 2011     Aug 1
21 2011     Sep 0
22 2011     Oct 0
23 2011     Nov 0
24 2011     Dec 0

CodePudding user response:

library(tidyverse); library(lubridate)
df %>%
  pivot_longer(contains("date")) %>%
  mutate(chg = ifelse(name == "hospitalization_date", 1, -1),
         month = if_else(chg == -1, 
                         ceiling_date(value, "month"),
                         floor_date(value, "month"))) %>%
  count(month, wt = chg, name = "change") %>%
  complete(month = seq.Date(min(month), max(month), 
           by = "month"), fill = list(change=0)) %>%
  mutate(count = cumsum(change))

Result

        month change count
1  2010-01-01      1     1
2  2010-02-01      1     2
3  2010-03-01      0     2
4  2010-04-01      1     3
5  2010-05-01      0     3
6  2010-06-01     -1     2
7  2010-07-01     -1     1
8  2010-08-01      0     1
9  2010-09-01     -1     0
10 2010-10-01      0     0
11 2010-11-01      0     0
12 2010-12-01      0     0
13 2011-01-01      0     0
14 2011-02-01      0     0
15 2011-03-01      1     1
16 2011-04-01      0     1
17 2011-05-01      0     1
18 2011-06-01      1     2
19 2011-07-01     -1     1
20 2011-08-01      0     1
21 2011-09-01     -1     0
  • Related