Let's say I have a double like 3.5 and I would like to find out where it would be sorted into an existing sorted vector, say seq(1, 10); put differently, which index the new number would take in the vector. Of course it sits somewhere between 3 and 4 and hence between the third and fourth index, but what would be the fastest way to arrive at this result?
CodePudding user response:
As mentioned in the comments, findInterval is fastest for this purpose. Even a very simple loop in C++ (compiled via Rcpp) that does much the same thing is a little slower on average.
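library(Rcpp)

# Linear scan: return the 1-based position of the first element >= x,
# or NA if x is larger than every element of v.
cppFunction("int find_index(double x, NumericVector v) {
  int len = v.size();
  for(int i = 0; i < len; i++) {
    if(x <= v[i]) return i + 1;
  }
  return NA_INTEGER;
}")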
microbenchmark::microbenchmark(
findInterval = findInterval(453993.5, 1:1000000),
find_index = find_index(453993.5, 1:1000000)
)
#> Unit: milliseconds
#> expr min lq mean median uq max neval
#> findInterval 1.9646 2.1739 2.996931 2.32375 2.4846 37.4218 100
#> find_index 2.2151 2.4502 11.319199 2.60925 2.9800 337.9229 100
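For the small example from the question, a minimal sketch of how findInterval could be used looks like this. The variable names x, v, left, insert_at and v_new are only illustrative. Note that findInterval returns the index of the element just to the left of x, so the position the new number would take is one higher.

```r
x <- 3.5
v <- seq(1, 10)

# findInterval() returns how many elements of the sorted vector v are <= x,
# i.e. the index of the element immediately to the left of x.
left <- findInterval(x, v)

# The index x would occupy after insertion is one position further right.
insert_at <- left + 1

# append() can then place x directly after position `left`.
v_new <- append(v, x, after = left)
```

Here left is 3 and insert_at is 4, which matches "between the third and fourth index" from the question.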
CodePudding user response:
Something like this?
First define dbl and my_seq, then concatenate both with c(dbl, my_seq) and wrap it with sort, then define the index with which(my_vec == dbl):
dbl <- 3.5
my_seq <- seq(1,10)
my_vec <- sort(c(dbl, my_seq))
index <- which(my_vec == dbl)
index
output:
[1] 4
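One caveat with this approach: if dbl is already present in my_seq, which returns more than one index. A possible variant, sketched here under the assumption that the first valid position is wanted, counts the smaller elements instead and avoids the sort altogether. The names matches and first_index are only illustrative.

```r
dbl <- 3
my_seq <- seq(1, 10)

# With a value that already exists, the sort-and-which approach
# reports every matching position rather than a single index.
my_vec <- sort(c(dbl, my_seq))
matches <- which(my_vec == dbl)

# Counting the elements strictly smaller than dbl gives the first
# position dbl could take, without building or sorting a new vector.
first_index <- sum(my_seq < dbl) + 1
```

In this case matches holds two indices (3 and 4), while first_index is the single value 3.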