Why does Numba skew the timings of a JIT-compiled function?


I'm trying to benchmark a Python function that does list operations, comparing Numba against the plain CPython interpreter. To measure end-to-end time I used the Linux time utility:

time python3.10 list.py

As I understand it, the first invocation will be expensive due to JIT compilation, but that does not explain why the maximum recorded time is longer than the total time taken to run the entire script.

# list.py
import numpy as np
from time import time, perf_counter
from numba import njit

@njit
def listOperations():
    list = []
    for i in range(1000):
        list.append(i)

    list.sort(reverse=True)
    list.remove(420)
    list.reverse()

if __name__ == "__main__":
    repetitions = 1000
    timings = np.zeros(repetitions)

    for rep in range(repetitions):
        start = time()  # Similar results with perf_counter too.
        listOperations()
        timings[rep] = time() - start

    # Convert to milliseconds
    timings *= 10e3
    print("Mean {}ms, Median {}ms, Std. Dev {}ms, Min {}ms, Max {}ms".format(
            float('%.4f' % np.mean(timings)),
            float('%.4f' % np.median(timings)),
            float('%.4f' % np.std(timings)),
            float('%.4f' % np.min(timings)),
            float('%.4f' % np.max(timings)))
    )

With Numba, the script reports a maximum of ~66.3s, while the time utility reports a total runtime of only ~8s. The complete results are below.

'''
Numba --->
Mean 66.8154ms, Median 0.391ms, Std. Dev 2097.7752ms, Min 0.3219ms, Max 66371.1143ms

real  0m7.982s
user  0m8.248s
sys   0m0.100s

CPython3.10 --->
Mean 1.6395ms, Median 1.6284ms, Std. Dev 0.0708ms, Min 1.5759ms, Max 2.3198ms

real  0m1.115s
user  0m1.468s
sys   0m0.080s 
'''

CodePudding user response:

The main issue is that the compilation time is included in the timings. Indeed, Numba compiles functions lazily: compilation happens during the first call, not when the decorator is applied. To prevent this, you must either specify the function signature ahead of time (eager compilation) or execute the first call outside the timed region (which is generally good practice in benchmarks anyway).
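For example, a minimal warm-up sketch, reusing the names from the question's script (the warm-up line is the only addition):

listOperations()  # warm-up call: pays the JIT compilation cost outside the timed loop

for rep in range(repetitions):
    start = perf_counter()
    listOperations()
    timings[rep] = perf_counter() - start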

You can use @njit('()') instead of @njit so the function is compiled eagerly. With this fix, the Numba code is about twice as fast on my machine.
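Concretely, only the decorator changes; '()' is an explicit empty signature, so Numba compiles the function when the decorator runs rather than at the first call (the body below is unchanged apart from renaming list to lst, which the note further down also recommends):

from numba import njit

@njit('()')  # explicit no-argument signature: compile eagerly, not lazily
def listOperations():
    lst = []
    for i in range(1000):
        lst.append(i)
    lst.sort(reverse=True)
    lst.remove(420)
    lst.reverse()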

Note that your function takes no parameters and returns nothing, so the JIT is free to optimize it to a no-op. To avoid this bias, you should add a parameter, actually use it, and return the list. This apparently does not happen on my machine, but other versions of Numba may do it.
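For instance, a sketch of a version the JIT cannot legally drop (the parameter n and the return value are illustrative additions; depending on the Numba version, returning a reflected list may emit a deprecation warning):

from numba import njit

@njit
def listOperations(n):
    lst = []
    for i in range(n):
        lst.append(i)
    lst.sort(reverse=True)
    lst.remove(420)   # assumes n > 420
    lst.reverse()
    return lst        # returning the result keeps the work observable

result = listOperations(1000)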

Note also that lists are generally not where Numba shines. Lists are slow both with and without Numba. It is better to use arrays when the size is known in advance.
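For example, a rough array-based sketch of the same operations (not an exact drop-in, since list.remove() has no direct NumPy equivalent; boolean masking plays that role here):

import numpy as np
from numba import njit

@njit
def arrayOperations(n):
    arr = np.arange(n)           # size known up front: one allocation, no appends
    desc = np.sort(arr)[::-1]    # sort descending
    kept = desc[desc != 420]     # drop the value 420 (stands in for list.remove)
    return kept[::-1]            # reverse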

By the way, list is a built-in type. Shadowing it can cause sneaky bugs in code that relies on it (which is common), so it is not a good idea. I advise you to use another name.

Furthermore, note the shape of your results: the standard deviation is huge, the median is small, and the maximum is enormous. This shows the timings are unstable and that the instability comes from a single slow call. Such a pattern generally indicates either a flawed benchmark or a function with unstable behaviour (typically due to a bug, or to initialization work done only once, which is exactly what JIT compilation is here).
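A quick way to confirm this, assuming the timings array from the question's script, is to compare the slowest call with the runner-up; if compilation is the culprit, the slowest call is the first one and it dwarfs all the others:

import numpy as np

order = np.argsort(timings)
print("slowest call:   index", order[-1], "time", timings[order[-1]])
print("second slowest: index", order[-2], "time", timings[order[-2]])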
