My data are as follows:
df <- structure(list(site = c("A", "A", "A", "A", "B", "B", "B", "B",
"A", "A", "A", "A", "B", "B", "B", "B"), year = c(2019L, 2019L,
2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2020L, 2020L, 2020L,
2020L, 2020L, 2020L, 2020L, 2020L), date = c("2019-08-12", "2019-08-14",
"2019-10-16", "2019-10-27", "2019-08-10", "2019-08-13", "2019-10-01",
"2019-10-16", "2020-08-08", "2020-08-11", "2020-10-14", "2020-10-16",
"2020-08-07", "2020-08-19", "2020-10-15", "2020-10-20")),
class = "data.frame", row.names = c(NA,
-16L))
For each site and each year, I would like to subset a date range with the following conditions:
The range will begin from the latest date prior to August 15th, and will end at the earliest date after October 15th.
Thank you in advance!
CodePudding user response:
Grouped by 'site', year', filter
the 'date' that are between
the date
value that is less than August 15, and the 'date' value just greater than the October 15th. date[date < as.Date(ISOdate(year, 8, 15))]
, subsets the dates which are less than the August 15, wrapping with max
returns the latest subset date. Similarly for the earliest date is calculated from October 15th.
library(dplyr)
df %>%
mutate(date = as.Date(date)) %>%
group_by(site, year) %>%
filter(between(date, max(date[date < as.Date(ISOdate(year, 8, 15))]),
min(date[date > as.Date(ISOdate(year, 10, 15))]))) %>%
ungroup