I see that Numba does not support a dict of lists, so I decided to use 2D NumPy arrays instead. This is sad :(
The second problem I have is that I want to create this array on demand. Here is an example:
import numba as nb
import numpy as np

@nb.njit(parallel=True)
def blah(cond=True):
    ary = None  # this is the assignment Numba complains about
    if cond:
        ary = np.zeros((10000, 2))
    for i in range(5):
        if cond:
            ary[i] = np.array([i, i])
    return 555, ary
The problem is that ary cannot be None, so I have to allocate the array even if I do not use it.
Is there a way to define ary without allocating it, so that Numba won't complain?
The parallel=True option seems to cause the problem?
Interestingly, this updates only the first row (even though i is incremented):
ary[i,:] = np.array([a,b])
but this works:
ary[i] = np.array([a,b])
CodePudding user response:
If you want the code to be parallelized, then yes, it absolutely has to be allocated first. You can't have multiple threads trying to resize an array independently.
CodePudding user response:
You can consider allocating a zero-size array instead of None. That makes the type of ary identical regardless of whether the condition is True or False.
@nb.njit(parallel=True)
def blah(cond=True):
    if cond:
        ary = np.zeros((10000, 2))
    else:
        ary = np.zeros((0, 0))
    for i in range(5):
        if cond:
            ary[i, :] = (i, i)
    return ary

blah(cond=False)
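With a zero-size placeholder the caller can no longer test the result against None, so checking the array's size is one possible substitute. A minimal sketch, assuming Numba is installed; the try/except fallback and the names make_table and n_rows are illustrative and not part of the original code:

```python
import numpy as np

try:
    import numba as nb
    njit = nb.njit
except ImportError:
    # Assumption: fall back to plain Python so the sketch still runs without Numba.
    def njit(*args, **kwargs):
        return lambda func: func


@njit()
def make_table(cond, n_rows):
    # Both branches yield a 2D float64 array, so Numba sees a single type.
    if cond:
        ary = np.zeros((n_rows, 2))
    else:
        ary = np.zeros((0, 0))  # placeholder that takes the place of None
    return ary


table = make_table(False, 10000)
if table.size == 0:
    # The placeholder holds no elements, so nothing was really allocated.
    print("no table was requested")
```

The check on table.size plays the role that "is None" would play in plain Python.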
It's also a lot faster to avoid initializing a new array in each iteration of the loop (np.array([i, i])), if that's possible of course.
Explicitly using Numba's prange for the loop might help a little, but that usually only becomes significant if you do a more intensive computation inside the loop. This example is so simple that the overhead of parallelization is not really worth it.
for i in nb.prange(10000):
    if cond:
        ary[i, 0] = i
        ary[i, 1] = i
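Putting the pieces together, the loop above could sit inside a complete function along these lines. This is only a sketch: the name fill_rows, the parameter n, and the fallback for a missing Numba installation are assumptions added for illustration.

```python
import numpy as np

try:
    import numba as nb
    njit, prange = nb.njit, nb.prange
except ImportError:
    # Assumption: plain-Python fallback so the sketch runs without Numba.
    def njit(*args, **kwargs):
        return lambda func: func
    prange = range


@njit(parallel=True)
def fill_rows(cond, n):
    # Allocate up front; threads only write into existing rows.
    if cond:
        ary = np.zeros((n, 2))
    else:
        ary = np.zeros((0, 0))  # zero-size placeholder instead of None
    for i in prange(n):
        if cond:
            # Element-wise writes avoid creating a temporary array per iteration.
            ary[i, 0] = i
            ary[i, 1] = i
    return ary


filled = fill_rows(True, 5)
empty = fill_rows(False, 5)
print(filled.shape, empty.size)
```

Each iteration writes to its own row, so the iterations are independent and safe to run under prange.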