I have weather data for summarize across different date intervals. Here is the weather data:
library(dplyr)
rainfall_data <- read.csv(text = "
date,rainfall_daily_mm
01/01/2019,0
01/02/2019,1
01/03/2019,3
01/04/2019,45
01/05/2019,0
01/06/2019,0
01/07/2019,0
01/08/2019,43
01/09/2019,5
01/10/2019,0
01/11/2019,55
01/12/2019,6
01/13/2019,0
01/14/2019,7
01/15/2019,0
01/16/2019,7
01/17/2019,8
01/18/2019,89
01/19/2019,65
01/20/2019,3
01/21/2019,0
01/22/2019,0
01/23/2019,2
01/24/2019,0
01/25/2019,0
01/26/2019,0
01/27/2019,0
01/28/2019,22
01/29/2019,3
01/30/2019,0
01/31/2019,0
") %>%
mutate(date = as.Date(date, format = "%d/%m/%Y"))
And here is the date intervals I need to get summaries of from the weather file:
intervals <- read.csv(text= "
treatment,initial,final
A,01/01/2019,01/05/2019
B,01/13/2019,01/20/2019
C,01/12/2019,01/26/2019
D,01/30/2019,01/31/2019
E,01/11/2019,01/23/2019
F,01/03/2019,01/19/2019
G,01/01/2019,01/24/2019
H,01/26/2019,01/28/2019
") %>%
mutate(initial = as.Date(initial, format = "%d/%m/%Y"),
final = as.Date(final, format = "%d/%m/%Y"))
The expected outcome is this one:
This is what I've tried based on a similar question:
summary_by_date_interval <- rainfall_data %>%
mutate(group = cumsum(grepl(intervals$initial|intervals$final, date))) %>%
group_by(group) %>%
summarise(rainfall = sum(rainfall_daily_mm))
And this is the error I got:
Error in `mutate()`:
! Problem while computing `group = cumsum(grepl(intervals$initial |
intervals$final, date))`.
Caused by error in `Ops.Date()`:
! | not defined for "Date" objects
Run `rlang::last_error()` to see where the error occurred.
Any help will be really appreciated.
CodePudding user response:
First %d/%m/%Y
need to be %m/%d/%Y
(or you'll have wrong dates and many NA
's).
Then you could e.g. use lubridate
s interval
and %within%
:
library(dplyr)
library(lubridate)
intervals |>
group_by(treatment) |>
mutate(test = sum(rainfall_data$rainfall_daily_mm[rainfall_data$date %within% interval(initial, final)])) |>
ungroup()
Output:
# A tibble: 8 × 4
treatment initial final test
<chr> <date> <date> <int>
1 A 2019-01-01 2019-01-05 49
2 B 2019-01-13 2019-01-20 179
3 C 2019-01-12 2019-01-26 187
4 D 2019-01-30 2019-01-31 0
5 E 2019-01-11 2019-01-23 242
6 F 2019-01-03 2019-01-19 333
7 G 2019-01-01 2019-01-24 339
8 H 2019-01-26 2019-01-28 22