Time difference check in R

Time:05-20

I have a data frame, DF_1, that shows the hospital admission date (date and time), the hospital discharge date (date and time), and whether the stay is a readmission within 30 days. Here it is:

ID <- c(111,222,222,333,444,444,555,666,1010,1010,1010)
PATIENT_ADMISSION <- c('18/03/2022 15:30','24/03/2022 12:28','27/03/2022 01:38','31/03/2022 08:53','16/04/2022 22:45','22/04/2022 13:15','05/04/2022 05:44','30/03/2022 06:16','10/01/2022 17:30','16/03/2022 22:00','08/04/2022 14:49')
PATIENT_DISCHARGE <- c('01/04/2022 11:20','26/03/2022 12:56','27/03/2022 17:52','01/04/2022 16:15','17/04/2022 12:26','25/04/2022 14:54','05/04/2022 11:44','07/04/2022 05:23','12/01/2022 06:35','06/04/2022 11:35','12/04/2022 12:36')
PATIENT_READMISSION_30D <- c('N','N','Y','N','N','Y','N','N','N','N','Y')
DF_1 <- data.frame(ID,PATIENT_ADMISSION,PATIENT_DISCHARGE,PATIENT_READMISSION_30D)

I want to add one more piece of information to DF_1: whether each readmission (PATIENT_READMISSION_30D = Y) happened within 72 hours. DF_1 would then have one more variable and look like this:

ID <- c(111,222,222,333,444,444,555,666,1010,1010,1010)
PATIENT_ADMISSION <- c('18/03/2022 15:30','24/03/2022 12:28','27/03/2022 01:38','31/03/2022 08:53','16/04/2022 22:45','22/04/2022 13:15','05/04/2022 05:44','30/03/2022 06:16','10/01/2022 17:30','16/03/2022 22:00','08/04/2022 14:49')
PATIENT_DISCHARGE <- c('01/04/2022 11:20','26/03/2022 12:56','27/03/2022 17:52','01/04/2022 16:15','17/04/2022 12:26','25/04/2022 14:54','05/04/2022 11:44','07/04/2022 05:23','12/01/2022 06:35','06/04/2022 11:35','12/04/2022 12:36')
PATIENT_READMISSION_30D <- c('N','N','Y','N','N','Y','N','N','N','N','Y')
PATIENT_READMISSION_72H <- c('','','Y','','','N','','','','','Y')
DF_1 <- data.frame(ID,PATIENT_ADMISSION,PATIENT_DISCHARGE,PATIENT_READMISSION_30D,PATIENT_READMISSION_72H)

Therefore, I would like to know how it is possible to check and include this new variable.

CodePudding user response:

You can use difftime to calculate the time difference in hours and dplyr::case_when to satisfy the conditions:

library(dplyr)  # provides %>%, mutate() and case_when()

# ensure proper format for dates
DF_1[2:3] <- lapply(DF_1[2:3], lubridate::dmy_hm)

#DF_1$test <- difftime(DF_1$PATIENT_DISCHARGE, DF_1$PATIENT_ADMISSION, units = "hours")

DF_1 %>% mutate(PATIENT_READMISSION_72H = case_when(
  PATIENT_READMISSION_30D == "N" ~ "",
  difftime(PATIENT_DISCHARGE, PATIENT_ADMISSION, units = "hours") <= 72 ~ "Y",
  difftime(PATIENT_DISCHARGE, PATIENT_ADMISSION, units = "hours") > 72 ~ "N"
))

Output:

#      ID   PATIENT_ADMISSION   PATIENT_DISCHARGE PATIENT_READMISSION_30D PATIENT_READMISSION_72H
# 1   111 2022-03-18 15:30:00 2022-04-01 11:20:00                       N                            
# 2   222 2022-03-24 12:28:00 2022-03-26 12:56:00                       N                            
# 3   222 2022-03-27 01:38:00 2022-03-27 17:52:00                       Y                           Y
# 4   333 2022-03-31 08:53:00 2022-04-01 16:15:00                       N                            
# 5   444 2022-04-16 22:45:00 2022-04-17 12:26:00                       N                            
# 6   444 2022-04-22 13:15:00 2022-04-25 14:54:00                       Y                           N
# 7   555 2022-04-05 05:44:00 2022-04-05 11:44:00                       N                            
# 8   666 2022-03-30 06:16:00 2022-04-07 05:23:00                       N                            
# 9  1010 2022-01-10 17:30:00 2022-01-12 06:35:00                       N                            
# 10 1010 2022-03-16 22:00:00 2022-04-06 11:35:00                       N                            
# 11 1010 2022-04-08 14:49:00 2022-04-12 12:36:00                       Y                           N

Note that row 11 differs from your desired output (an "N" instead of a "Y"). That is because, in the provided data, PATIENT_ADMISSION and PATIENT_DISCHARGE in that row are more than 72 hours apart (93.78333 hours). If there is another reason why this should be "Y", please let me know and I will modify my answer. You can see the time differences by un-commenting the DF_1$test line in the code above.
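
One reading that would explain the "Y" in row 11 of the desired output is that the 72 hours are counted from the previous discharge of the same ID to the current admission, rather than from admission to discharge within one row. The sketch below is only an illustration under that assumption, written in base R. It also assumes that the rows are sorted so that each patient's stays are adjacent and in chronological order, as they are in the question. The names adm, dis, same_id and gap_hours are introduced here for illustration and are not part of the original code.

```r
# Sketch: flag readmissions that occur within 72 hours of the PREVIOUS discharge.
# Assumption: rows are ordered by ID and, within each ID, by admission time.
ID <- c(111,222,222,333,444,444,555,666,1010,1010,1010)
PATIENT_ADMISSION <- c('18/03/2022 15:30','24/03/2022 12:28','27/03/2022 01:38','31/03/2022 08:53','16/04/2022 22:45','22/04/2022 13:15','05/04/2022 05:44','30/03/2022 06:16','10/01/2022 17:30','16/03/2022 22:00','08/04/2022 14:49')
PATIENT_DISCHARGE <- c('01/04/2022 11:20','26/03/2022 12:56','27/03/2022 17:52','01/04/2022 16:15','17/04/2022 12:26','25/04/2022 14:54','05/04/2022 11:44','07/04/2022 05:23','12/01/2022 06:35','06/04/2022 11:35','12/04/2022 12:36')
PATIENT_READMISSION_30D <- c('N','N','Y','N','N','Y','N','N','N','N','Y')
DF_1 <- data.frame(ID, PATIENT_ADMISSION, PATIENT_DISCHARGE, PATIENT_READMISSION_30D)

# Parse the day/month/year strings; a fixed time zone (UTC, an assumption)
# keeps daylight-saving changes out of the hour counts
adm <- as.POSIXct(DF_1$PATIENT_ADMISSION, format = "%d/%m/%Y %H:%M", tz = "UTC")
dis <- as.POSIXct(DF_1$PATIENT_DISCHARGE, format = "%d/%m/%Y %H:%M", tz = "UTC")

n <- nrow(DF_1)

# TRUE when the row above belongs to the same patient
same_id <- c(FALSE, DF_1$ID[-1] == DF_1$ID[-n])

# Hours from the previous row's discharge to this row's admission;
# NA for each patient's first stay
gap_hours <- rep(NA_real_, n)
gap_hours[-1] <- as.numeric(difftime(adm[-1], dis[-n], units = "hours"))
gap_hours[!same_id] <- NA

# Blank for non-readmissions, otherwise "Y"/"N" depending on the 72-hour limit
DF_1$PATIENT_READMISSION_72H <- ifelse(
  DF_1$PATIENT_READMISSION_30D == "Y",
  ifelse(gap_hours <= 72, "Y", "N"),
  ""
)

print(DF_1)
```

With the data from the question, this yields "Y" for rows 3 and 11 and "N" for row 6, which matches the PATIENT_READMISSION_72H column requested. Whether this is the intended definition of a 72-hour readmission is for the asker to confirm.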

CodePudding user response:

Here is a proposal that uses {dplyr} and {lubridate}:

library(dplyr)
library(lubridate)

DF_1 %>% 
  mutate(across(c(PATIENT_ADMISSION, PATIENT_DISCHARGE), 
                . %>% as_datetime(format = "%d/%m/%Y %H:%M"))) %>%
  group_by(ID) %>% 
  mutate(
    PATIENT_READMISSION_72H = 
      if_else(difftime(PATIENT_DISCHARGE, PATIENT_ADMISSION, units = "hours") <= 72, "Y", "N")
  ) %>% 
  ungroup() %>% 
  print(width=Inf)