I was trying to optimize code and wanted to use the .Internal implementation of vapply, and got an error I don't understand. (For now I'll take the warning in ?.Internal seriously, that "Only true R wizards should even consider using this function", and use the user-visible vapply, but I'd like to understand the error better nonetheless.)
test <- rnorm(10000)
test_l <- as.list(test)
test_2 <- .Internal(vapply(test_l, \(x) x^2, numeric(1), FALSE))
# Error: '...' used in an incorrect context
Compare this to the code of the user-visible vapply:
function (X, FUN, FUN.VALUE, ..., USE.NAMES = TRUE)
{
    FUN <- match.fun(FUN)
    if (!is.vector(X) || is.object(X))
        X <- as.list(X)
    .Internal(vapply(X, FUN, FUN.VALUE, USE.NAMES))
}
Can someone explain to me why this does not work?
Some of my hypotheses:
Does it run in another namespace than the function body of the user-visible vapply, or do I call the user-visible function instead of the .Internal for some other reason?
Is it because the internal vapply is a special internal that works differently from other internals?
The R-ints manual [1] states:
2.2 Special internals
There are also special .Internal functions: NextMethod, Recall, withVisible, cbind, rbind (to allow for the deparse.level argument), eapply, lapply and vapply.
But it does not give more detail.
Can someone give me more detail on those "Special internals"?
[1] https://cran.r-project.org/doc/manuals/R-ints.html#g_t_002eInternal-vs-_002ePrimitive
CodePudding user response:
The C code refers to R_DotsSymbol, i.e., the ellipsis (...). The calling scope does not have an ellipsis. At top level it cannot have one, because that code is not running inside a function call.
You can reproduce the error using standard R code like this:
foo <- function() {
  return(...)
}
foo()
#Error in foo() : '...' used in an incorrect context
Or with the .Internal call to vapply like this:
bar <- function(X, FUN, FUN.VALUE, USE.NAMES) {
  .Internal(vapply(X, FUN, FUN.VALUE, USE.NAMES))
}
bar(test_l, \(x) x^2, numeric(1), FALSE)
#Error in bar(test_l, function(x) x^2, numeric(1), FALSE) :
# '...' used in an incorrect context
This works because ... exists in the calling scope:
baz <- function(X, FUN, FUN.VALUE, USE.NAMES, ...) {
  .Internal(vapply(X, FUN, FUN.VALUE, USE.NAMES))
}
x <- baz(test_l, \(x) x^2, numeric(1), FALSE)
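To illustrate why the ellipsis matters, here is a minimal sketch. It assumes (as the C code's use of R_DotsSymbol suggests) that the internal builds a call of the form FUN(X[[i]], ...) and evaluates it in the calling environment, so whatever is in the wrapper's ... gets forwarded to FUN. The name `wrapper` and the extra argument `p` are hypothetical, chosen only for this example.

```r
# Hypothetical wrapper: same shape as baz above, so `...` exists in its scope.
wrapper <- function(X, FUN, FUN.VALUE, USE.NAMES, ...) {
  .Internal(vapply(X, FUN, FUN.VALUE, USE.NAMES))
}

# The extra argument p travels through `...` to FUN on every iteration.
via_internal <- wrapper(list(1, 2, 3), function(x, p) x^p, numeric(1), FALSE, p = 3)

# The user-visible vapply forwards `...` in exactly the same way.
via_public <- vapply(list(1, 2, 3), function(x, p) x^p, numeric(1), p = 3,
                     USE.NAMES = FALSE)

print(via_internal)
print(via_public)
```

Both calls should give the same result, which is consistent with ... being looked up in the environment of the function that contains the .Internal call.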
You won't be able to produce significantly faster code by skipping those first few lines of vapply. They are not your bottleneck. It might help to implement the function that is repeatedly called by vapply with Rcpp, but a true performance boost can only be achieved by implementing the whole loop with Rcpp. Calls to R closures are expensive and you want to avoid them in loops with many iterations.
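For the specific toy example in the question, the loop over closures can be avoided without Rcpp, because ^ is already vectorized in R. The following is only a sketch of that idea for this particular function, not a general replacement for vapply; the variable names `looped` and `vectorised` are made up for the example.

```r
set.seed(1)
test <- rnorm(10000)
test_l <- as.list(test)

# One R closure call per element.
looped <- vapply(test_l, \(x) x^2, numeric(1), USE.NAMES = FALSE)

# No closure calls at all: flatten the list once, then square the whole vector.
vectorised <- unlist(test_l)^2

# Both approaches should agree.
stopifnot(isTRUE(all.equal(looped, vectorised)))
```

Whether something like this applies to the real code depends on whether the function passed to vapply can itself be expressed with vectorized operations.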
CodePudding user response:
Roland's analysis is correct here. There is a hack that allows you to get an ellipsis in the global environment, but it requires the function you pass to vapply to take an extra unused argument:
`...` <- (function(...) get("..."))(y = 2)
So now you could do:
test <- rnorm(10)
test_l <- as.list(test)
.Internal(vapply(test_l, \(x, y) x^2, numeric(1), FALSE))
#> [1] 1.49370808 0.02969854 4.80764382 2.96895104 0.69506047 1.53488883
#> [7] 0.12566700 1.27180579 0.08399010 0.02366073
However, this is not recommended. Although there is a very small overhead to calling the .Internal from inside a closure, as Roland says, this is not going to be the rate-limiting factor in your code.
If we measure it:
microbenchmark::microbenchmark(
  hack = {
    `...` <- (function(...) get("..."))(y = 2)
    .Internal(vapply(test_l, \(x, y) x^2, numeric(1), FALSE))
  },
  standard = vapply(test_l, \(x) x^2, numeric(1), USE.NAMES = FALSE)
)
#> Unit: microseconds
#> expr min lq mean median uq max neval cld
#> hack 9.0 9.3 9.690 9.6 9.8 17.8 100 a
#> standard 6.7 7.0 15.261 7.1 7.2 817.9 100 a
We can see that although the hack has a slightly lower mean time (only because of the occasional outlier in the standard version; its median is actually higher), the difference is on the order of 5 microseconds per call, so you might save yourself 5 milliseconds if you call this routine 1000 times. When you consider the opacity and difficulty of debugging such an approach, it is simply not worth it.
Created on 2022-11-08 with reprex v2.0.2