Home > Software design >  I want to calculate the difference in dates based on the entries in another column
I want to calculate the difference in dates based on the entries in another column

Time:01-04

ID Date
1 2022-09-05
1 2022-09-07
1 2022-09-10
2 2022-09-16
2 2022-09-07
  1. The ID's are not unique.
  2. In the case where the ID appears more than twice then get the average of days (i.e for ID = 1 , the outcome will be mean(c(difftime(2022-09-10,2022-09-07,units="days"),mean(c(difftime(2022-09-07,2022-09-05,units="days"))))
  3. In the case where the ID has only 2 entries the just get the difference between the dates

The desired outcome for the above table should be

ID Days
1 2.5 days
2 9 days

CodePudding user response:

library(dplyr)

group_by(df, ID) %>%
  summarise(res=mean(difftime(Date, lag(Date), units="days"), na.rm=TRUE))

# A tibble: 2 × 2
     ID res      
  <dbl> <drtn>   
1     1  2.5 days
2     2 -9.0 days

Data

df <- tibble(ID=c(1,1,1,2,2),
             Date=c(as.Date("2022-09-05"),
                    as.Date("2022-09-07"),
                    as.Date("2022-09-10"),
                    as.Date("2022-09-16"),
                    as.Date("2022-09-07")))

CodePudding user response:

In base R:

aggregate(Date ~ ID, df, \(x) mean(abs(diff(x))))

  ID     Date
1  1 2.5 days
2  2 9.0 days
  •  Tags:  
  • r
  • Related