I have counts of hourly salmon passage past an observation tower over the course of a season and found the proportion of the total seasonal passage for each hour.
I want to find the span of hours that account for at least 80% of daily passage. In other words, I need to find the shortest combination of continuous counts which sum to greater than or equal to 0.80.
For example, if I have:
fish <- data.frame(hour = c(0000, 0100, 0200, 0300, 0400, 0500, 0600, 0700, 0800,
0900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700,
1800, 1900, 2000, 2100, 2200, 2300),
prop = c(0.06, 0.05, 0.01, 0.00, 0.00, 0.02, 0.03, 0.02, 0.03,
0.02, 0.01, 0.01, 0.00, 0.00, 0.01, 0.03, 0.05, 0.09,
0.10, 0.07, 0.07, 0.11, 0.11, 0.11))
I need a function that will return:
hour prop
1 1600 0.05
2 1700 0.09
3 1800 0.10
4 1900 0.07
5 2000 0.07
6 2100 0.11
7 2200 0.11
8 2300 0.11
9 0 0.06
10 100 0.05
... which is the shortest continuous period (1600 - 0100, 10 hours) which sum to at least 80% of daily passage (.81).
CodePudding user response:
This function iterates over hour extents until the threshold is met.
fish <- data.frame(hour = c(0000, 0100, 0200, 0300, 0400, 0500, 0600, 0700, 0800,
0900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700,
1800, 1900, 2000, 2100, 2200, 2300),
prop = c(0.06, 0.05, 0.01, 0.00, 0.00, 0.02, 0.03, 0.02, 0.03,
0.02, 0.01, 0.01, 0.00, 0.00, 0.01, 0.03, 0.05, 0.09,
0.10, 0.07, 0.07, 0.11, 0.11, 0.11))
f <- function(dt, thresh) {
v <- dt$prop
f2 <- function(n) {
v1 <- c(v, v[1:n])
isum <- tail(cumsum(v1), -n) - cumsum(c(0, head(v1, -n - 1)))
idx <- which.max(isum)
if (isum[idx] > thresh) return(idx) else integer(0)
}
ext <- 0L
idxStart <- which.max(v >= thresh)
if (v[idxStart] < thresh) idxStart <- integer(0)
while (!length(idxStart)) {
ext <- ext 1L
idxStart <- f2(ext)
}
idxEnd <- idxStart ext
if (idxEnd > length(v)) {
return(dt[c(idxStart:length(v), seq(idxEnd %% length(v))),])
} else {
return(dt[idxStart:idxEnd,])
}
}
f(fish, 0.8)
#> hour prop
#> 17 1600 0.05
#> 18 1700 0.09
#> 19 1800 0.10
#> 20 1900 0.07
#> 21 2000 0.07
#> 22 2100 0.11
#> 23 2200 0.11
#> 24 2300 0.11
#> 1 0 0.06
#> 2 100 0.05