Why doesn't the following code need a python interpreter?

Time:12-23

I was watching a video to learn Numba. At 17:00 the presenter has the following code on screen:

@njit
def simulate_spring_mass_funky_damper(x0, T=10, dt=0.0001, vt=1.0):
    times = np.arange(0, T, dt)
    positions = np.zeros_like(times)
    
    v = 0
    a = 0
    x = x0
    positions[0] = x0/x0
    
    for ii in range(len(times)):
        if ii == 0:
            continue
        t = times[ii]
        a = friction_fn(v, vt) - 100*x
        v = v + a*dt
        x = x + v*dt
        positions[ii] = x/x0
    return times, positions

The presenter then instructs Numba to release the GIL using @njit(nogil=True). His argument is:

This function does not need to access the python interpreter while it is running. In fact, we made sure of that. So we can additionally tell it to release the GIL

The presenter then uses this code with multiple threads:

%%time
from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(8) as ex:
    ex.map(simulate_spring_mass_funky_damper, np.arange(0, 1000, 0.1))

I understand that the GIL is required by any thread that wants to interact with the Python interpreter. But if the code does not need to interact with the Python interpreter, isn't releasing or not releasing the GIL moot?

  1. I do not understand how this function does not need the Python interpreter.
  2. If the code does not need to interact with the GIL, then what is even the point of releasing the GIL? It won't interact with the Python interpreter anyway, so the GIL won't come into play.

CodePudding user response:

I do not understand how this function does not need python interpreter?

Numba uses a JIT compiler (LLVM, via llvmlite) to translate Python code into fast native code that runs in the context of the interpreter. When a function uses the @njit decorator (or the equivalent @jit(nopython=True)), Numba internally generates two functions: a wrapper that converts the input pure-Python objects to internal native types, and a second function that performs the actual computation (the wrapper also converts the output values back). The key point is that CPython objects such as lists or dictionaries must be protected by the GIL, whereas Numba's internal native values need no such protection. Numba converts CPython integer objects to low-level fixed-size integers, while typed lists/dictionaries and Numpy arrays are translated to internal data structures that are independent of CPython. However, this approach does not work for every CPython object: Numba cannot operate on CPython objects while the GIL is released.

The case of Numpy arrays is a bit special, since Numpy is designed so that the GIL can be released when you work with native arrays (and only with those). Numpy arrays containing CPython objects need the GIL. Note also that Numba does not directly use the Numpy CPython API; it re-implements Numpy functions internally.

For lists and dictionaries, Numba copies the whole data structure into another one that does not need the GIL. This is only possible with fully typed lists/dictionaries, though. The wrapping function then converts the typed lists/dictionaries back to CPython ones. This operation is expensive.

In short: in your case, the wrapping function generated by Numba needs the GIL because it works with CPython objects, but the actual computing function does not need the GIL. Thus, specifying nogil=True causes the wrapping function to release the GIL before it calls the computing function (and to re-acquire it after the computing function has finished).
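
To make the order of operations concrete, here is a toy model of that wrapper in plain Python. It is only a sketch: toy_gil is an ordinary threading.Lock standing in for the GIL, and compute and make_wrapper are hypothetical names invented for this illustration. None of this is Numba's real generated code.

```python
import threading

# Stand-in for the GIL. This is NOT the real interpreter lock.
toy_gil = threading.Lock()


def compute(x0, steps):
    # Stand-in for the compiled function: touches only plain numbers.
    x = x0
    for _ in range(steps):
        x = x * 0.5 + 1.0
    return x


def make_wrapper(nogil):
    def wrapper(py_x0, py_steps):
        toy_gil.acquire()
        # "Unboxing": Python objects -> native-like values, under the lock.
        x0, steps = float(py_x0), int(py_steps)
        if nogil:
            toy_gil.release()      # let other threads run during the computation
        result = compute(x0, steps)
        if nogil:
            toy_gil.acquire()      # take the lock back before touching objects
        # "Boxing": native-like value -> Python object, under the lock.
        boxed = float(result)
        toy_gil.release()
        return boxed
    return wrapper


with_release = make_wrapper(nogil=True)
without_release = make_wrapper(nogil=False)
print(with_release(4, 3), without_release(4, 3))
```

Both wrappers return the same value. The only difference is whether the lock is held while compute runs, which matters only when other threads are waiting for it.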

If the code does not need to interact with the GIL then what is even the point of releasing the GIL? It won't interact with python interpreter anyway and GIL won't come into play

Well, if you use the @njit decorator (or the equivalent @jit(nopython=True)), then specifying nogil=True is often useless when the code is called sequentially, since the compiled code does not use the GIL anyway (Numba "generates code that does not access the Python C API" in this case, according to the documentation).

However, in your case, many threads call the Numba function. If the generated wrapping function does not release the GIL, the Numba computing function is executed serially (and less efficiently than sequential code, due to expensive context switches between threads). Releasing the GIL solves this problem. Note that the wrapping code is still executed serially because of the GIL (which is required for the CPython object conversion).
