Home > Software engineering >  How to get time difference in days since last date (lag) in R using datatable?
How to get time difference in days since last date (lag) in R using datatable?

Time:04-06

person_id  diag_date concept_id  event diff_prev_event
1:         1 2012-01-15    4265600 comorb         NA secs
2:         1 2012-01-15     201820 comorb          0 secs
3:         1 2012-03-15    4265600 comorb    5184000 secs
4:         2 2012-03-15     201820 comorb         NA secs
5:         2 2012-06-22     201820 comorb    8553600 secs
6:         2 2012-06-22    4265600 comorb          0 secs

I am trying to calculate the days since last event for each person. I am running into two issues.

  1. The time difference is showing in seconds. I need to get the days. (5184000 secs = 30 days)
  2. If two days are same dates then the 2nd one is showing 0 when it should be looking at the different date. Row 5 and 6 is same date so they would have the same date difference.

This is the code I tried:

dt[order(diag_date),diff_prev_event := difftime(diag_date, lag( diag_date)), by = c("person_id") ]

CodePudding user response:

Specify the units

library(data.table)
dt[order(diag_date),diff_prev_event := difftime(diag_date, 
     lag( diag_date), units = 'days'), by = c("person_id") ]

Then, we grouped by 'person_id' and 'diag_date' and change the values to the max if there are more than one row

dt[, diff_prev_event := if(.N > 1) max(diff_prev_event, 
    na.rm = TRUE) else diff_prev_event, .(person_id, diag_date)]
> dt
   person_id  diag_date concept_id  event diff_prev_event
       <int>     <Date>      <int> <char>      <difftime>
1:         1 2012-01-15    4265600 comorb          0 days
2:         1 2012-01-15     201820 comorb          0 days
3:         1 2012-03-15    4265600 comorb         60 days
4:         2 2012-03-15     201820 comorb         NA days
5:         2 2012-06-22     201820 comorb         99 days
6:         2 2012-06-22    4265600 comorb         99 days

-output

data

dt <- structure(list(person_id = c(1L, 1L, 1L, 2L, 2L, 2L), diag_date = structure(c(15354, 
15354, 15414, 15414, 15513, 15513), class = "Date"), concept_id = c(4265600L, 
201820L, 4265600L, 201820L, 201820L, 4265600L), event = c("comorb", 
"comorb", "comorb", "comorb", "comorb", "comorb")), row.names = c(NA, 
-6L), class = c("data.table", "data.frame"))
  • Related