Home > Net >  How to calculate the time between the first event and the last event in SAS with multiple duplicates
How to calculate the time between the first event and the last event in SAS with multiple duplicates

Time:05-17

I have a dataset with patient data. I want to calculate how many days have passed between a patients first visit and their last to the hospital. I also need a dummy variable (0,1) that takes the value 1 if the observation was the patients first in the dataset (according to the visit date). The dataset looks like:

Patient ID  Visit date
1           2014-04-21
1           2015-01-29
1           2021-04-14
2           2020-01-03
2           2021-07-04
.           .
.           .

What I want:

Patient ID  Visit date  First visit  Difference between first visit and last (in days)
1           2014-04-21  1            0
1           2015-01-29  0            283
1           2021-04-14  0            2550  
2           2020-01-03  1            0
2           2021-07-04  0            548
.           .           .             .
.           .           .             .

If I sort the dataset by Patient ID and Visit date and the run the code: if first.PatientID then do; First_visit = 1; end; Im able to create my dummy variable. I have trouble with calculating the difference in days between first and last visit. Would greatly appreciate any help. Thanks!

CodePudding user response:

You will need to RETAIN a new variable to get the cumulative days. If you use a SUM statement then variable is automatically retained. If you use the DIF() function to get the difference in days between the current and previous date then you can just retain the DAYS variable itself rather than needing to retain the actual first date value.

data have;
  input ID  DATE :yymmdd.;
  format date yymmdd10.;
cards;
1 2014-04-21
1 2015-01-29
1 2021-04-14
2 2020-01-03
2 2021-07-04
;

data want;
  set have;
  by id date;
  first_visit=first.id;
  days   dif(date);
  if first.id then days=0;
run;

A couple of notes.

The FIRST. variable is already coded 1/0 so just use an assignment statement to save its value into a permanent variable.

Do not run the DIF() function conditionally. That will skip adding the current date into the stack to be retrieved the next time. That is why you should add the DIF() to DAYS before forcing DAYS to zero for the first observations for an ID.

Results:

                           first_
Obs    ID          DATE     visit    days

 1      1    2014-04-21       1         0
 2      1    2015-01-29       0       283
 3      1    2021-04-14       0      2550
 4      2    2020-01-03       1         0
 5      2    2021-07-04       0       548

CodePudding user response:

data WANT;
  set HAVE;
  by ID;
  first_visit = first.ID;
  retain firstDate;
  if first.ID then firstDate = Date;
  days = Date - firstDate;
  drop firstDate;
run;
  • Related