I would like to filter my dataframe my_df using the following rule: if all of a province's dates fall on one side of 26/09 (none strictly before it, or none strictly after it), that province should be removed, as in desired_df.
my_df <- data.frame(Province=c(1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 4, 4),
date=c("23/09", "24/09", "25/09", "26/09", "27/09", "18/09","21/09", "23/09", "26/09", "29/09", "02/10", "25/09", "26/09", "27/09"))
desired_df <- data.frame(Province=c(1, 1, 1, 1, 1, 4, 4, 4),
date=c("23/09", "24/09", "25/09", "26/09", "27/09", "25/09", "26/09", "27/09"))
CodePudding user response:
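Using dplyr: I add a helper column, date2, that holds date parsed as a Date, filter on it by group, and then drop it. The source dates have no year, so I assume 2022.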
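library(dplyr)

# Append an explicit year so parsing does not depend on the current year
my_df <- my_df %>% mutate(date2 = as.Date(paste0(date, "/2022"), "%d/%m/%Y"))
cutoff <- as.Date("2022-09-26")
my_df %>% group_by(Province) %>%
  # keep provinces with dates both before and after the cutoff
  filter(min(date2) < cutoff,
         max(date2) > cutoff) %>%
  ungroup() %>%
  select(-date2)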
Output
# A tibble: 8 x 2
  Province date
     <dbl> <chr>
1        1 23/09
2        1 24/09
3        1 25/09
4        1 26/09
5        1 27/09
6        4 25/09
7        4 26/09
8        4 27/09
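
If you would rather avoid the dplyr dependency, the same rule can be sketched in base R. This is only an illustration of the idea, not the approach used in the answer above: the year 2022 is an assumption (the data has no year), and the names d, cutoff, has_before, has_after and result are made up for this example.

```r
my_df <- data.frame(
  Province = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 4, 4),
  date = c("23/09", "24/09", "25/09", "26/09", "27/09", "18/09", "21/09",
           "23/09", "26/09", "29/09", "02/10", "25/09", "26/09", "27/09")
)

# Assumption: all dates fall in 2022; the year is added only so they parse.
d <- as.Date(paste0(my_df$date, "/2022"), format = "%d/%m/%Y")
cutoff <- as.Date("2022-09-26")

# For each province, flag whether it has at least one date strictly before
# and at least one strictly after the cutoff; ave() repeats the flag per row.
has_before <- ave(d < cutoff, my_df$Province, FUN = any)
has_after  <- ave(d > cutoff, my_df$Province, FUN = any)

# Keep only rows whose province satisfies both conditions.
result <- my_df[has_before & has_after, ]
rownames(result) <- NULL
```

Here result should contain the same rows as desired_df from the question. Testing any(d < cutoff) and any(d > cutoff) per group is equivalent to the min/max comparison in the dplyr version.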