Say I have a vector v=c(10,3,5,1,12,7,9,2). For each value, I want to find the number of steps until the "next higher" value, that is, the next value that is greater than the current one.
For instance, the 1st value is 10, the next higher is 12, and 12 is 4 steps away from 10. So the first element is associated with 4. Next, we have a 3, which is followed by 5: there's only 1 step until the next higher value. The end result should thus be c(4,1,2,1,NA,1,NA,NA), inserting NA whenever there is no "next higher" value: 12 is never beaten, and neither is the final 2 nor the 9 before it.
I can do that with a 'for' loop:
v=c(10,3,5,1,12,7,9,2)
# stop 1 step before the last
n=length(v)-1
#initialize vector
next_higher=vector()
for (i in 1:n) {
  # check if the next higher exists: the vector of higher values is non-empty
  if (length(which(v[(i+1):(n+1)]>v[i]))==0) {
    # if not, insert NA
    next_higher=c(next_higher,NA_real_)
  } else {
    # else, get the index and move on
    next_higher=c(next_higher,which(v[(i+1):(n+1)]>v[i])[1])
  }
}
# the last one is always going to be NA
next_higher=c(next_higher,NA)
But this is notoriously inefficient and inelegant.
I also tried a recursive function:
find_next_higher = function (x) {
  # recursive function
  ifelse(length(x)==1,
    # if length is 1 there's no next higher
    return(NA_real_),
    # else check if there is a next higher
    ifelse(length(which(x[-1]>x[1]))==0,
      # if it doesn't exist, return NA and concatenate, removing the first element
      return(c(NA_real_,find_next_higher(x[-1]))),
      # if it does, find index and concatenate, removing the first element
      return(c(which(x[-1]>x[1])[1],find_next_higher(x[-1])))
    )
  )
}
But I run into a deep-recursion problem: it doesn't work for large vectors.
What's the cleanest way to do that?
I thought about the apply function family, or the purrr library, but failed to find a way to work not on each value individually, but on the remaining v[(n+1):length(v)] subvector.
Thanks in advance for your suggestions.
CodePudding user response:
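We can loop over the indices of the vector with sapply. For each index i, drop the first i elements of v, compare what remains with the current element (v[i]), use which to get the positions where the comparison is TRUE, and keep the first one ([1]). Because those positions are counted within the remaining subvector, the first position is exactly the number of steps to the next higher value; when nothing matches, [1] returns NA.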
sapply(seq_along(v), \(i) which(v[-(seq_len(i))] > v[i])[1])
[1] 4 1 2 1 NA 1 NA NA
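To see why the one-liner yields both the step counts and the NA values, it may help to unpack it for a single index. This is only an illustrative breakdown; the intermediate names (rest, higher, steps, no_match) are introduced here and do not appear in the answer.

```r
v <- c(10, 3, 5, 1, 12, 7, 9, 2)
i <- 1                      # look at the first element, 10

rest   <- v[-seq_len(i)]    # drop the first i elements: 3 5 1 12 7 9 2
higher <- rest > v[i]       # TRUE only where a later value exceeds 10
steps  <- which(higher)[1]  # position of the first TRUE within rest

# When no later value is higher, which() returns integer(0),
# and taking [1] of an empty vector gives NA.
i <- 5                      # 12 is never beaten
no_match <- which(v[-seq_len(i)] > v[i])[1]
```

The same mechanism covers the last element: v[-seq_len(length(v))] is empty, so the comparison is empty and the result is NA without any special case.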
The \(i) syntax is a compact notation for an anonymous function (lambda), available in recent versions of R. With an older R version, use function(i) instead. As noted in the NEWS for R 4.1.0:
R now provides a shorthand notation for creating functions, e.g. \(x) x + 1 is parsed as function(x) x + 1.
sapply(seq_along(v), function(i) which(v[-(seq_len(i))] > v[i])[1])
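Since the question mentions large vectors, note that the sapply call rescans the tail of v for every index, which can become quadratic (for example on a decreasing vector). A common alternative, which is not part of this answer, is a single pass that keeps a stack of indices still waiting for a higher value. The following is a minimal sketch under a few assumptions: the helper name next_higher_stack is hypothetical, v is assumed to contain no NA values, and "higher" means strictly greater, as in the question.

```r
# Hypothetical helper (not from the thread): steps to the next strictly higher value.
# Assumes v contains no NA values.
next_higher_stack <- function(v) {
  n <- length(v)
  res <- rep(NA_integer_, n)  # stays NA where no later value is higher
  stack <- integer(n)         # indices still waiting for a higher value
  top <- 0L                   # current stack height
  for (i in seq_len(n)) {
    # v[i] resolves every waiting index whose value is strictly smaller
    while (top > 0L && v[stack[top]] < v[i]) {
      res[stack[top]] <- i - stack[top]
      top <- top - 1L
    }
    # i itself now waits for a higher value
    top <- top + 1L
    stack[top] <- i
  }
  res
}

v <- c(10, 3, 5, 1, 12, 7, 9, 2)
next_higher_stack(v)
```

Each index is pushed once and popped at most once, so the loop does a linear amount of work overall. For short vectors the sapply version is likely simpler to read and fast enough.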