I want to calculate the difference in dates based on the entries in another column-CodePudding

ID	Date
1	2022-09-05
1	2022-09-07
1	2022-09-10
2	2022-09-16
2	2022-09-07

The ID's are not unique.
In the case where the ID appears more than twice then get the average of days (i.e for ID = 1 , the outcome will be mean(c(difftime(2022-09-10,2022-09-07,units="days"),mean(c(difftime(2022-09-07,2022-09-05,units="days"))))
In the case where the ID has only 2 entries the just get the difference between the dates

The desired outcome for the above table should be

ID	Days
1	2.5 days
2	9 days

CodePudding user response：

library(dplyr)

group_by(df, ID) %>%
  summarise(res=mean(difftime(Date, lag(Date), units="days"), na.rm=TRUE))

# A tibble: 2 × 2
     ID res      
  <dbl> <drtn>   
1     1  2.5 days
2     2 -9.0 days

Data

df <- tibble(ID=c(1,1,1,2,2),
             Date=c(as.Date("2022-09-05"),
                    as.Date("2022-09-07"),
                    as.Date("2022-09-10"),
                    as.Date("2022-09-16"),
                    as.Date("2022-09-07")))

CodePudding user response：

In base R:

aggregate(Date ~ ID, df, \(x) mean(abs(diff(x))))

  ID     Date
1  1 2.5 days
2  2 9.0 days