Closest subsequent index for a specified value

Consider a vector:

int = c(1, 1, 0, 5, 2, 0, 0, 2)

I'd like to get the closest subsequent index (not the difference) for a specified value. The first parameter of the function should be the vector, and the second should be the value whose closest subsequent occurrence one wants to find.

For instance,

f(int, 0)
# [1] 2 1 0 2 1 0 0 NA

Here, the first element of the vector (1) is two positions away from the first subsequent 0 (3 - 1 = 2), so it should return 2. The second element is one position away from that 0 (3 - 2 = 1). When no subsequent value matches the specified value, return NA (here that is the case for the last element, because no subsequent value is 0).

Other examples:

f(int, 1)
# [1] 0 0 NA NA NA NA NA NA

f(int, 2) 
# [1] 4 3 2 1 0 2 1 0

f(int, 3) 
# [1] NA NA NA NA NA NA NA NA

This should also work for character vectors:

char = c("A", "B", "C", "A", "A")

f(char, "A") 
# [1] 0 2 1 0 0

CodePudding user response：

For each position i, look for matches from the i-th position to the end of the vector, then take the first match:

f <- function(v, x){
  sapply(seq_along(v), function(i){
    which(v[ i:length(v) ] == x)[ 1 ] - 1
  })
}

f(int, 0)
# [1]  2  1  0  2  1  0  0 NA
f(int, 1)
# [1]  0  0 NA NA NA NA NA NA
f(int, 2)
# [1] 4 3 2 1 0 2 1 0
f(int, 3) 
# [1] NA NA NA NA NA NA NA NA

f(char, "A") 
# [1] 0 2 1 0 0

CodePudding user response：

Here f is defined as a recursive function that calls itself over shorter tails of the lookup vector:

f <- function(lookup, val) {
  ind <- which(lookup == val)[1] - 1
  if (length(lookup) > 1) {
    c(ind, f(lookup[-1], val))
  } else {
    ind
  }
}
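
As a quick check, the recursive version can be run against the examples from the question. The sketch below repeats the definition under the hypothetical name f_rec so that it is self-contained; the expected values are the ones given in the question.

```r
# Recursive definition from above, renamed f_rec (hypothetical name) for this sketch
f_rec <- function(lookup, val) {
  ind <- which(lookup == val)[1] - 1   # distance to the first match in this tail (NA if none)
  if (length(lookup) > 1) {
    c(ind, f_rec(lookup[-1], val))     # recurse on the vector without its first element
  } else {
    ind
  }
}

int <- c(1, 1, 0, 5, 2, 0, 0, 2)
char <- c("A", "B", "C", "A", "A")

f_rec(int, 0)
# [1]  2  1  0  2  1  0  0 NA
f_rec(char, "A")
# [1] 0 2 1 0 0
```

Each element adds one level of recursion, so very long vectors may run into R's nesting limits; for short vectors like these that is not a concern.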

CodePudding user response：

Find the location of each occurrence of the value (numeric or character):

int = c(1, 1, 0, 5, 2, 0, 0, 2)
value = 0
idx = which(int == value)
## [1] 3 6 7

Expand the index so that each position holds the index of the nearest subsequent match, with NA for positions after the last match in int.

nearest = rep(NA, length(int))
nearest[1:max(idx)] = rep(idx, diff(c(0, idx)))
nearest
## [1]  3  3  3  6  6  6  7 NA

Use simple arithmetic to find the difference between the index of the current value and the index of the nearest value

abs(seq_along(int) - nearest)
## [1]  2  1  0  2  1  0  0 NA

Written as a function

f <- function(x, value) {
    idx = which(x == value)
    nearest = rep(NA, length(x))
    # fill only when there is at least one match; otherwise everything stays NA
    if (length(idx) > 0)
        nearest[1:max(idx)] = rep(idx, diff(c(0, idx)))
    abs(seq_along(x) - nearest)
}

We have

> f(int, 0)
[1]  2  1  0  2  1  0  0 NA
> f(int, 1)
[1]  0  0 NA NA NA NA NA NA
> f(int, 2)
[1] 4 3 2 1 0 2 1 0
> f(char, "A")
[1] 0 2 1 0 0
> f(char, "B")
[1]  1  0 NA NA NA
> f(char, "C")
[1]  2  1  0 NA NA

The solution doesn't involve recursion or R-level loops, so it should be fast even for long vectors.
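
A rough way to check the speed claim is to time the vectorised function against the sapply() version from the first answer on a longer vector. This is only a sketch: the names f_loop, f_vec and big are hypothetical, the test vector and seed are arbitrary, and f_vec includes a guard for the no-match case that is an addition made for this example. Timings depend on the machine, so none are shown.

```r
# Loop-based version from the first answer (hypothetical name f_loop)
f_loop <- function(v, x) {
  sapply(seq_along(v), function(i) which(v[i:length(v)] == x)[1] - 1)
}

# Vectorised version from this answer (hypothetical name f_vec),
# with an added guard for the case where `value` never occurs
f_vec <- function(x, value) {
  idx <- which(x == value)
  nearest <- rep(NA_integer_, length(x))
  if (length(idx) > 0) {
    nearest[seq_len(max(idx))] <- rep(idx, diff(c(0L, idx)))
  }
  abs(seq_along(x) - nearest)
}

set.seed(1)                               # arbitrary seed, for reproducibility
big <- sample(0:9, 1e4, replace = TRUE)   # arbitrary test vector

# Both versions should agree before their timings are compared
stopifnot(isTRUE(all.equal(f_loop(big, 0), f_vec(big, 0))))

system.time(f_loop(big, 0))
system.time(f_vec(big, 0))
```

The sapply() version rescans the rest of the vector at every position, so its work grows roughly with the square of the length, while the vectorised version makes a fixed number of passes.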

CodePudding user response：

Here is an approach using Reduce() and then some fiddling to get the NA values.

f <- function(vec, value) {
replace(
  Reduce(
    function(x, y)
      x + (y * x),
    vec != value,
    right = TRUE,
    accumulate = TRUE
  ),
  max(tail(which(vec == value), 1), 0) < seq_along(vec),
  NA
)
}

f(int, 0)          
[1]  2  1  0  2  1  0  0 NA

f(int, 1)          
[1]  0  0 NA NA NA NA NA NA

f(int, 2) 
[1] 4 3 2 1 0 2 1 0

f(int, 3) 
[1] NA NA NA NA NA NA NA NA

char = c("A", "B", "C", "A", "A")

f(char, "A") 
[1] 0 2 1 0 0
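
To see what the "fiddling" does, it may help to look at the intermediate result of Reduce() before the NA values are put in. The sketch below assumes the combining function is x + (y * x); the names not_match, counts and last_match are hypothetical and used only for illustration.

```r
int <- c(1, 1, 0, 5, 2, 0, 0, 2)
not_match <- int != 0
not_match
# [1]  TRUE  TRUE FALSE  TRUE  TRUE FALSE FALSE  TRUE

# With right = TRUE, Reduce() calls the function as f(element, accumulated):
# a match (x is FALSE) resets the count to 0, otherwise the count grows by 1.
counts <- Reduce(function(x, y) x + (y * x), not_match,
                 right = TRUE, accumulate = TRUE)
counts
# [1] 2 1 0 2 1 0 0 1

# Positions after the last match have no subsequent match, so they become NA.
# max(..., 0) covers the case where there is no match at all.
last_match <- max(tail(which(int == 0), 1), 0)
replace(counts, seq_along(int) > last_match, NA)
# [1]  2  1  0  2  1  0  0 NA
```

The trailing 1 in counts is just the last element (TRUE) carried through as the starting value, which is why the replace() step is needed.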