R compare dates with group by statement

Time:10-27

Using dplyr, I want to do a group by, followed by a date comparison for the following data frame.

df <- data.frame(ID = c(1,1,2,2,3,3,4,4,5,6),
                 X1 = c("A","A","B","C","A","B","B","B","C","A"),
                 X2 = sample(10:30,10,replace = TRUE),
                 dat = as.Date(c("2021-01-01","2021-01-01","2021-02-01","2021-02-01","2021-01-03",
                         "2021-10-05","2021-05-05","2021-05-06","2021-09-14","2021-06-04")))

The grouping should be on ID and X1 (X2 can be ignored). Within each group of rows that share the same ID and X1, the dates should be compared, and only groups whose dates differ by 1 day or less (in either direction) should be kept. The desired output is:

  ID X1
1  1  A
2  1  A
3  4  B
4  4  B

CodePudding user response:

Group by ID and X1, then keep only those groups that have 2 or more rows and in which the difference between consecutive dates is at most 1 day.

You can try -

library(dplyr)

df %>%
  group_by(ID, X1) %>%
  filter(n() >= 2, all(abs(diff(dat)) <= 1)) %>%
  ungroup()

#     ID X1       X2 dat       
#  <dbl> <chr> <int> <date>    
#1     1 A        30 2021-01-01
#2     1 A        19 2021-01-01
#3     4 B        24 2021-05-05
#4     4 B        30 2021-05-06

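The X2 values shown come from sample() without a seed, so they will differ between runs. If you are only interested in the ID and X1 columns, add %>% select(ID, X1) to the end of the pipeline.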
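
For reference, here is a self-contained sketch that puts the data and the pipeline together and ends with the two-column result from the question. Three things in it are assumptions added for illustration, not part of the original answer: the set.seed() call (only there to make X2 reproducible), the arrange() step (diff() compares rows in their current order, so sorting by date first may be safer for groups with more than two rows), and the explicit as.numeric() conversion of the date differences.

```r
library(dplyr)

set.seed(42)  # assumption: any seed works; it only makes X2 reproducible

df <- data.frame(ID = c(1,1,2,2,3,3,4,4,5,6),
                 X1 = c("A","A","B","C","A","B","B","B","C","A"),
                 X2 = sample(10:30, 10, replace = TRUE),
                 dat = as.Date(c("2021-01-01","2021-01-01","2021-02-01","2021-02-01","2021-01-03",
                                 "2021-10-05","2021-05-05","2021-05-06","2021-09-14","2021-06-04")))

result <- df %>%
  arrange(ID, X1, dat) %>%           # assumption: sort so diff() sees dates in order
  group_by(ID, X1) %>%
  filter(n() >= 2,                   # single-row groups have nothing to compare
         all(abs(as.numeric(diff(dat))) <= 1)) %>%  # consecutive gaps of at most 1 day
  ungroup() %>%
  select(ID, X1)                     # keep only the columns asked for

print(result)
```

With the sample data this should keep the two rows for ID 1 / X1 "A" and the two rows for ID 4 / X1 "B", matching the desired output in the question.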
