Is running R code inside a function faster?

The question is: Does R code run faster in a function?

Consider the following examples:

> start<-Sys.time()
> for(i in 1:10000){}
> Sys.time()-start
Time difference of 0.01399994 secs
> 
> fn<-function(){
    start<-Sys.time()
    for(i in 1:10000){}
    Sys.time()-start
  }
> fn()
Time difference of 0.00199604 secs



start<-Sys.time()
for(i in 1:10000){x<-100}
Sys.time()-start
Time difference of 0.012995 secs
fn<-function(){
  start<-Sys.time()
  for(i in 1:10000){x<-100}
  Sys.time()-start
}
fn()
Time difference of 0.008996964 secs

The result is the same after increasing the number of iterations, as shown below:

> sim<-10000000
> start<-Sys.time()
> for(i in 1:sim){x<-i}
> Sys.time()-start
Time difference of 2.832 secs
> 
> fn<-function(){
    start<-Sys.time()
    for(i in 1:sim){x<-i}
    Sys.time()-start
  }
> fn()
Time difference of 2.017997 secs

I would guess it's not a coincidence!

CodePudding user response:

Functions in R are byte-compiled by the just-in-time (JIT) compiler, typically before their first or second call. Once compiled, most functions run faster.
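
To see byte-compilation directly, a function can also be compiled by hand with compiler::cmpfun(). This is only a sketch: sum_loop and has_bytecode are hypothetical names introduced for illustration, and has_bytecode assumes that printing a compiled closure shows a "<bytecode: ...>" line.

```r
# A plain R function with a loop (hypothetical example).
sum_loop <- function(n) {
  total <- 0
  for (i in seq_len(n)) total <- total + i
  total
}

# Compile it explicitly instead of waiting for the JIT to do so.
sum_loop_compiled <- compiler::cmpfun(sum_loop)

# Heuristic check (an assumption): a compiled closure prints a "<bytecode" line.
has_bytecode <- function(f) {
  any(grepl("<bytecode", capture.output(print(f)), fixed = TRUE))
}

# Both versions compute the same result; only the evaluation strategy differs.
result_plain    <- sum_loop(100)
result_compiled <- sum_loop_compiled(100)
compiled_flag   <- has_bytecode(sum_loop_compiled)
```

With the default JIT level, sum_loop itself will usually be compiled automatically after a call or two, so calling cmpfun() by hand is rarely needed in practice.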

As the docs in ?compiler::enableJIT say,

JIT is disabled if the argument is 0. If level is 1 then larger closures are compiled before their first use. If level is 2, then some small closures are also compiled before their second use. If level is 3 then in addition all top level loops are compiled before they are executed. JIT level 3 requires the compiler option optimize to be 2 or 3. The JIT level can also be selected by starting R with the environment variable R_ENABLE_JIT set to one of these values. Calling enableJIT with a negative argument returns the current JIT level. The default JIT level is 3.

So, many functions will run faster than the equivalent top-level code.
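
As a small sketch of the calls described in that help page, the JIT level can be queried, changed, and restored like this. The variable names are illustrative only.

```r
# A negative argument returns the current JIT level without changing it.
current_level <- compiler::enableJIT(-1)

# Switch the JIT off; enableJIT() returns the level that was active before.
previous_level <- compiler::enableJIT(0)

# Restore the earlier setting so the rest of the session is unaffected.
compiler::enableJIT(previous_level)
restored_level <- compiler::enableJIT(-1)
```

Saving and restoring the previous level keeps an experiment like this from silently slowing down later code in the same session.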

CodePudding user response:

Credit (and upvotes) should go to @user2554330 for finding the reason.

To demonstrate the JIT behaviour, I used this benchmark:

library(microbenchmark)

compiler::enableJIT(0)

fn <- function() {
   for(i in 1:10000) {}
}

microbenchmark(for_loop_without_func = for(i in 1:10000) {},
               for_loop_in_func = fn(),
               times = 100)

The result shows that with JIT disabled, the execution times are nearly the same:

Unit: microseconds
                  expr     min       lq     mean   median      uq     max neval
 for_loop_without_func 180.619 180.7990 182.7129 180.9290 181.050 239.489   100
      for_loop_in_func 182.582 182.7075 186.2232 182.7625 182.938 309.912   100

With compiler::enableJIT(3) (which is the default), the function is about 12 times faster by median:

Unit: microseconds
                  expr     min       lq      mean   median       uq      max neval
 for_loop_without_func 558.727 574.4875 659.21931 657.3425 702.6475 1984.351   100
      for_loop_in_func  53.019  53.4955  61.59588  53.7260  54.0320  790.632   100
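
If microbenchmark is not installed, a similar comparison can be sketched in base R with system.time(). This is a rough illustration, not a replacement for the benchmark above: median_elapsed, loop_fn_nojit and loop_fn_jit are hypothetical names, the loop length of 1e6 is an arbitrary choice to get measurable times, and the actual numbers will vary by machine and R version.

```r
# Helper (hypothetical): median elapsed time, in seconds, over `reps` calls to f.
median_elapsed <- function(f, reps = 20) {
  stats::median(vapply(
    seq_len(reps),
    function(k) system.time(f())[["elapsed"]],
    numeric(1)
  ))
}

# Remember the current JIT level, then switch the JIT off.
old_level <- compiler::enableJIT(0)

# Defined while the JIT is off, so this function stays interpreted.
loop_fn_nojit <- function() for (i in 1:1e6) {}
t_nojit <- median_elapsed(loop_fn_nojit)

# Switch to level 3 and define a fresh function for the JIT to compile.
compiler::enableJIT(3)
loop_fn_jit <- function() for (i in 1:1e6) {}
t_jit <- median_elapsed(loop_fn_jit)

# Restore whatever level was active before the experiment.
compiler::enableJIT(old_level)

cat("JIT off:", t_nojit, "s; JIT level 3:", t_jit, "s\n")
```

Taking the median over several runs reduces the noise that a single Sys.time() difference would carry.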