Define an array without allocating it

I see that Numba does not support Dict-of-Lists ... Thus, I decided to use 2D NumPy arrays instead. This is sad :(

The second problem I have is that I want to create this array on demand. Here is an example:

import numba as nb
import numpy as np

@nb.njit(parallel=True)
def blah(cond=True):
    ary = None
    if cond:
        ary = np.zeros((10000, 2))

    for i in range(5):
        if cond:
            ary[i] = np.array([i, i])

    return 555, ary

The problem is that ary cannot be None, so I have to allocate the array even if I do not use it.

Is there a way to define ary without allocating it, so that Numba won't complain?

The parallel=True option seems to be what causes the problem?

It is also interesting that this updates only the first row (even though i is incremented):

ary[i,:] = np.array([a,b])

but this works:

ary[i] = np.array([a,b])

CodePudding user response：

If you want the code to be parallelized, then yes, it absolutely has to be allocated first. You can't have multiple threads trying to resize an array independently.

CodePudding user response：

You can consider allocating a zero-size array instead of None. That makes the type of ary identical whether the condition is True or False.

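import numba as nb
import numpy as np

@nb.njit(parallel=True)
def blah(cond=True):
    # Both branches produce a 2D float64 array, so ary has a single type.
    if cond:
        ary = np.zeros((10000, 2))
    else:
        ary = np.zeros((0, 0))

    for i in range(5):
        if cond:
            ary[i, :] = (i, i)

    return ary

blah(cond=False)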
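On the calling side, the zero-size array can play the role that None was meant to play. Here is a small sketch of how the caller might tell the two cases apart. The helper name is_allocated is made up for illustration and is not part of NumPy or Numba.

```python
import numpy as np

def is_allocated(ary):
    # Hypothetical helper: a zero-size array stands in for "no array".
    return ary.size > 0

placeholder = np.zeros((0, 0))   # same dtype and ndim as the real array, but no elements
real = np.zeros((10000, 2))

print(is_allocated(placeholder), is_allocated(real))
print(placeholder.nbytes)        # the placeholder holds no element data
```

The placeholder still is an array object, so it is not entirely free, but it carries no element buffer to speak of.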

It's also a lot faster to avoid initializing a new array in each iteration of the loop (np.array([i, i])), if that's possible of course.

Explicitly using Numba's prange for the loop might help a little, but that usually only becomes significant if you do a more intensive computation inside the loop. This example is so simple that the overhead of parallelization is not really worth it.

for i in nb.prange(10000):
    if cond:
        ary[i, 0] = i
        ary[i, 1] = i