I have the following data example:
trap_data <- structure(list(site = c(1, 2, 3, 3), trap_date = structure(c(18809,
18809, 18307, 18322), class = "Date")), class = "data.frame", row.names = c(NA,
-4L))
climate <- structure(list(site = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3),
V1 = c(0.1, 0.2, 0.1, 0.1, 0.5, 0.2, 0.3, 0.1, 0.2, 0.1,
0.1, 0.5), V2 = c(1, 1, 3, 1, 2, 3, 3, 1, 1, 3, 1, 2), date = structure(c(18779,
18790, 18789, 18792, 18791, 18790, 18789, 18792, 18305, 18306,
18307, 18308), class = "Date")), class = "data.frame", row.names = c(NA,
-12L))
I would add columns in trap_data that summarize N, mean and sum, of V1 and V2 of climate data, conditioned by site and date. However, I would these new variables (N, mean and sum) are for the N days (these time lengths are variable. for example 20 current days) before the specified date in trap_data.
Thanks,
CodePudding user response:
We may create a new column in the 'trap_data' - 20 days prior date and then do a join
library(data.table)
library(lubridate)
setDT(trap_data)[, prev_date := trap_date - days(20)]
setDT(climate)[trap_data, .(N = .N,
MeanV1 = mean(V1, na.rm = TRUE),
SumV1 = sum(V1, na.rm = TRUE),
MeanV2 = mean(V2, na.rm = TRUE),
SumV2 = sum(V2, na.rm = TRUE)),
on = .(site, date >= prev_date, date <= trap_date), by = .EACHI]
-output
site date date N MeanV1 SumV1 MeanV2 SumV2
<num> <Date> <Date> <int> <num> <num> <num> <num>
1: 1 2021-06-11 2021-07-01 3 0.1333333 0.4 1.666667 5
2: 2 2021-06-11 2021-07-01 4 0.2750000 1.1 2.250000 9
3: 3 2020-01-26 2020-02-15 3 0.1333333 0.4 1.666667 5
4: 3 2020-02-10 2020-03-01 4 0.2250000 0.9 1.750000 7