Consolidate a list of members enrolled each month to a range of start and stop dates per member in R-CodePudding

Let's say I have a very tall data frame of all the months each doctor is enrolled:

doctor	month_enrolled
Dr. Brown	March, 2021
Dr. Brown	April, 2021
Dr. Brown	July, 2021
Dr. Brown	August, 2021
Dr. Green	April, 2021
Dr. Green	May, 2021
Dr. Black	Jan, 2021
Dr. Black	May, 2021
Dr. Black	June, 2021
Dr. Black	July, 2021
Dr. Pink	September, 2021
...	...

And I want to consolidate this into a smaller data frame with ranges of enrollment_start and enrollment_end per doctor like this:

doctor	enrollment_start	enrollment_end
Dr. Brown	March, 2021	April, 2021
Dr. Brown	July, 2021	August, 2021
Dr. Green	April, 2021	May, 2021
Dr. Black	Jan, 2021	Jan, 2021
DR. Black	May, 2021	July, 2021
Dr. Pink	September, 2021	September, 2021
...	...	...

Whereby non-continuous enrollment periods are reflected as separate rows.

How might I accomplish this in R, preferably using dplyr / tidyverse syntax?

CodePudding user response：

library(dplyr)
library(lubridate)

data.frame(
  stringsAsFactors = FALSE,
            doctor = c("Dr. Brown","Dr. Brown",
                       "Dr. Brown","Dr. Brown","Dr. Green","Dr. Green",
                       "Dr. Black","Dr. Black","Dr. Black","Dr. Black","Dr. Pink"),
    month_enrolled = c("March, 2021","April, 2021",
                       "July, 2021","August, 2021","April, 2021","May, 2021",
                       "Jan, 2021","May, 2021","June, 2021","July, 2021",
                       "September, 2021")
) %>%
  
transmute(doctor, month = dmy(paste(1,month_enrolled))) %>%
group_by(doctor) %>%
mutate(stint = 1   cumsum(c(0, diff(month) > 31))) %>% # EDIT - shorter
group_by(doctor, stint) %>%
summarize(enrollment_start = min(month),
            enrollment_end = max(month), .groups = "drop")

Result

# A tibble: 6 × 4
  doctor    stint enrollment_start enrollment_end
  <chr>     <dbl> <date>           <date>        
1 Dr. Black     1 2021-01-01       2021-01-01    
2 Dr. Black     2 2021-05-01       2021-07-01    
3 Dr. Brown     1 2021-03-01       2021-04-01    
4 Dr. Brown     2 2021-07-01       2021-08-01    
5 Dr. Green     1 2021-04-01       2021-05-01    
6 Dr. Pink      1 2021-09-01       2021-09-01