R subset dataframe following rule based on date

Time:04-25

I would like to subset my data frame my_df using the following rule: if a province has no date before 26/09, or no date after 26/09, all of its rows should be removed, as in desired_df

my_df <- data.frame(Province=c(1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 4, 4),
                    date=c("23/09", "24/09", "25/09", "26/09", "27/09", "18/09","21/09", "23/09", "26/09", "29/09", "02/10", "25/09", "26/09", "27/09"))

desired_df <- data.frame(Province=c(1, 1, 1, 1, 1, 4, 4, 4),
                    date=c("23/09", "24/09", "25/09", "26/09", "27/09", "25/09", "26/09", "27/09"))

CodePudding user response:

Using dplyr

I add a temporary column, date2, that holds the dates parsed as Date objects. It is used only for filtering and is dropped at the end.

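library(dplyr)

# Append an explicit year: as.Date() with '%d/%m' alone fills in the current year
my_df <- my_df %>% mutate(date2 = as.Date(paste0(date, '/2022'), '%d/%m/%Y'))

# Keep a province only if it has dates both before and after 26/09
my_df %>% group_by(Province) %>%
  filter(min(date2) < as.Date('2022-09-26'),
         max(date2) > as.Date('2022-09-26')) %>%
  ungroup() %>%
  select(-date2)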

Output

# A tibble: 8 x 2
  Province date 
     <dbl> <chr>
1        1 23/09
2        1 24/09
3        1 25/09
4        1 26/09
5        1 27/09
6        4 25/09
7        4 26/09
8        4 27/09
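
For comparison, the same rule can be sketched in base R without dplyr. This is only a minimal sketch, not part of the original answer: it assumes every date belongs to the same year (2022 is an arbitrary choice here), and the names d, cutoff, num, keep and result are made up for illustration. ave() computes a per-province minimum and maximum and repeats it on every row of that province, so the comparison yields one TRUE/FALSE per row.

```r
my_df <- data.frame(
  Province = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 4, 4),
  date = c("23/09", "24/09", "25/09", "26/09", "27/09", "18/09", "21/09",
           "23/09", "26/09", "29/09", "02/10", "25/09", "26/09", "27/09")
)

# Assumption: all dates fall in one year; 2022 is an arbitrary choice
d <- as.Date(paste0(my_df$date, "/2022"), format = "%d/%m/%Y")
cutoff <- as.numeric(as.Date("2022-09-26"))

# Work on the numeric day counts so ave() returns plain numbers
num <- as.numeric(d)

# TRUE for rows whose province has a date before AND a date after the cutoff
keep <- ave(num, my_df$Province, FUN = min) < cutoff &
  ave(num, my_df$Province, FUN = max) > cutoff

result <- my_df[keep, ]
rownames(result) <- NULL
result
```

With the sample data, result should contain only the rows for provinces 1 and 4, matching desired_df.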