With the small reproducible example below, I'd like to identify the dplyr
approach to arrive at the data.frame shown at the end of this note. The features of the dplyr
output is that it will ensure that the data.frame is sorted by date (note that the dates 1999-04-13
and 1999-03-12
are out of order) and that it then "accumulate" the number of days within each wy
grouping (wy
= "water year"; Oct 1-Sep 30) that Q
is above a threshold of 3.0.
dat <- read.table(text="
Date wy Q
1997-01-01 1997 9.82
1997-02-01 1997 3.51
1997-02-02 1997 9.35
1997-10-04 1998 0.93
1997-11-01 1998 1.66
1997-12-02 1998 0.81
1998-04-03 1998 5.65
1998-05-05 1998 7.82
1998-07-05 1998 6.33
1998-09-06 1998 0.55
1998-09-07 1998 4.54
1998-10-09 1999 6.50
1998-12-31 1999 2.17
1999-01-01 1999 5.67
1999-04-13 1999 5.66
1999-03-12 1999 4.67
1999-06-05 1999 3.34
1999-09-30 1999 1.99
1999-11-06 2000 5.75
2000-03-04 2000 6.28
2000-06-07 2000 0.81
2000-07-06 2000 9.66
2000-09-09 2000 9.08
2000-09-21 2000 6.72", header=TRUE)
dat$Date <- as.Date(dat$Date)
mdat <- dat %>%
group_by(wy) %>%
filter(Q > 3) %>%
?
Desired results:
Date wy Q abvThreshCum
1997-01-01 1997 9.82 1
1997-02-01 1997 3.51 2
1997-02-02 1997 9.35 3
1997-10-04 1998 0.93 0
1997-11-01 1998 1.66 0
1997-12-02 1998 0.81 0
1998-04-03 1998 5.65 1
1998-05-05 1998 7.82 2
1998-07-05 1998 6.33 3
1998-09-06 1998 0.55 3
1998-09-07 1998 4.54 4
1998-10-09 1999 6.50 1
1998-12-31 1999 2.17 1
1999-01-01 1999 5.67 2
1999-03-12 1999 4.67 3
1999-04-13 1999 5.66 4
1999-06-05 1999 3.34 5
1999-09-30 1999 1.99 5
1999-11-06 2000 5.75 1
2000-03-04 2000 6.28 2
2000-06-07 2000 0.81 2
2000-07-06 2000 9.66 3
2000-09-09 2000 9.08 4
2000-09-21 2000 6.72 5
CodePudding user response:
library(dplyr)
dat %>%
arrange(Date) %>%
group_by(wy) %>%
mutate(abv = cumsum(Q > 3)) %>%
ungroup()
# # A tibble: 24 x 4
# Date wy Q abv
# <date> <int> <dbl> <int>
# 1 1997-01-01 1997 9.82 1
# 2 1997-02-01 1997 3.51 2
# 3 1997-02-02 1997 9.35 3
# 4 1997-10-04 1998 0.93 0
# 5 1997-11-01 1998 1.66 0
# 6 1997-12-02 1998 0.81 0
# 7 1998-04-03 1998 5.65 1
# 8 1998-05-05 1998 7.82 2
# 9 1998-07-05 1998 6.33 3
# 10 1998-09-06 1998 0.55 3
# # ... with 14 more rows
CodePudding user response:
data.table approach
library(data.table)
setDT(dat, key = "Date")[, abvThreshCum := cumsum(Q > 3), by = .(wy)]