I have a following DF (demo). I would like to subset the data only for those IDs present on same Date with different City.
ID | DATE | CITY |
---|---|---|
ABC | 2022-07-08 | AAA |
ABC | 2022-07-22 | BBB |
XYZ | 2022-07-09 | CCC |
XYZ | 2022-07-09 | YYY |
PQR | 2022-09-22 | FFF |
PQR | 2022-09-26 | EEE |
EFG | 2022-10-03 | AAA |
EFG | 2022-10-16 | KKK |
EFG | 2022-10-16 | PPP |
EFG | 2022-10-16 | QQQ |
Desired Output:
ID | DATE | CITY |
---|---|---|
XYZ | 2022-07-09 | CCC |
XYZ | 2022-07-09 | YYY |
EFG | 2022-10-16 | KKK |
EFG | 2022-10-16 | PPP |
EFG | 2022-10-16 | QQQ |
CodePudding user response:
You can group_by
ID and DATE and keep groups that have more than 1 n_distinct
CITY:
library(dplyr)
df %>%
group_by(ID, DATE) %>%
filter(n_distinct(CITY) > 1)
output
# A tibble: 5 × 3
# Groups: ID, DATE [2]
ID DATE CITY
<chr> <chr> <chr>
1 XYZ 2022-07-09 CCC
2 XYZ 2022-07-09 YYY
3 EFG 2022-10-16 KKK
4 EFG 2022-10-16 PPP
5 EFG 2022-10-16 QQQ