Home > OS >  Find continuous rolling combination of values from a vector that sum to target value
Find continuous rolling combination of values from a vector that sum to target value

Time:02-15

I have counts of hourly salmon passage past an observation tower over the course of a season and found the proportion of the total seasonal passage for each hour.

I want to find the span of hours that account for at least 80% of daily passage. In other words, I need to find the shortest combination of continuous counts which sum to greater than or equal to 0.80.

For example, if I have:

fish <- data.frame(hour = c(0000,   0100,   0200,   0300,   0400,   0500,   0600,   0700,   0800,
                            0900,   1000,   1100,   1200,   1300,   1400,   1500,   1600,   1700,
                            1800,   1900,   2000,   2100,   2200,   2300),
                   prop = c(0.06,   0.05,   0.01,   0.00,   0.00,   0.02,   0.03,   0.02,   0.03,
                            0.02,   0.01,   0.01,   0.00,   0.00,   0.01,   0.03,   0.05,   0.09,
                            0.10,   0.07,   0.07,   0.11,   0.11,   0.11))

I need a function that will return:

   hour prop
1  1600 0.05
2  1700 0.09
3  1800 0.10
4  1900 0.07
5  2000 0.07
6  2100 0.11
7  2200 0.11
8  2300 0.11
9     0 0.06
10  100 0.05

... which is the shortest continuous period (1600 - 0100, 10 hours) which sum to at least 80% of daily passage (.81).

CodePudding user response:

This function iterates over hour extents until the threshold is met.

fish <- data.frame(hour = c(0000,   0100,   0200,   0300,   0400,   0500,   0600,   0700,   0800,
                            0900,   1000,   1100,   1200,   1300,   1400,   1500,   1600,   1700,
                            1800,   1900,   2000,   2100,   2200,   2300),
                   prop = c(0.06,   0.05,   0.01,   0.00,   0.00,   0.02,   0.03,   0.02,   0.03,
                            0.02,   0.01,   0.01,   0.00,   0.00,   0.01,   0.03,   0.05,   0.09,
                            0.10,   0.07,   0.07,   0.11,   0.11,   0.11))

f <- function(dt, thresh) {
  v <- dt$prop
  f2 <- function(n) {
    v1 <- c(v, v[1:n])
    isum <- tail(cumsum(v1), -n) - cumsum(c(0, head(v1, -n - 1)))
    idx <- which.max(isum)
    if (isum[idx] > thresh) return(idx) else integer(0)
  }
  
  ext <- 0L
  idxStart <- which.max(v >= thresh)
  if (v[idxStart] < thresh) idxStart <- integer(0)
  
  while (!length(idxStart)) {
    ext <- ext   1L
    idxStart <- f2(ext)
  }
  
  idxEnd <- idxStart   ext
  
  if (idxEnd > length(v)) {
    return(dt[c(idxStart:length(v), seq(idxEnd %% length(v))),])
  } else {
    return(dt[idxStart:idxEnd,])
  }
}

f(fish, 0.8)
#>    hour prop
#> 17 1600 0.05
#> 18 1700 0.09
#> 19 1800 0.10
#> 20 1900 0.07
#> 21 2000 0.07
#> 22 2100 0.11
#> 23 2200 0.11
#> 24 2300 0.11
#> 1     0 0.06
#> 2   100 0.05
  •  Tags:  
  • r
  • Related