I'm having a hard time applying Numba to my function. Basically, I'd like to concatenate two arrays with 22 columns if the new data hasn't been added yet. If there is no old data, the new data should become a 2D array. The function works fine without the decorator:
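import numpy as np
from numba import jit

@jit(nopython=True)
def add(new,original=np.array([])):
    duplicate=True
    if original.size!=0:
        # compare columns 11..18 of every stored row with the new row
        for raw in original:
            for ii in range(11,19):
                if raw[ii]!=new[ii]:
                    duplicate=False
        if duplicate==False:
            # append the new row below the existing data
            res=np.zeros((original.shape[0]+1,22))
            res[:original.shape[0]]=original
            res[-1]=new
            return res
        else:
            return original
    else:
        # no old data yet: the new row becomes a 2D array with one row
        res=np.zeros((1,22))
        res[0]=new
        return res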
Also, if I remove the last part of the code:
    else:
        res=np.zeros((1,22))
        res[0]=new
        return res
it works with njit.
So if I ignore the case where there is no old data yet, everything is fine.
FYI: the data I'm passing in is a mix of floats and np.nan.
Does anybody have an idea? Thank you so much in advance!
This is my error log:
---------------------------------------------------------------------------
TypingError Traceback (most recent call last)
<ipython-input-255-d05a5f4ea944> in <module>()
19 return res
20 #add(a,np.array([b]))
---> 21 add(a)
2 frames
/usr/local/lib/python3.7/dist-packages/numba/core/dispatcher.py in _compile_for_args(self, *args, **kws)
413 e.patch_message(msg)
414
--> 415 error_rewrite(e, 'typing')
416 except errors.UnsupportedError as e:
417 # Something unsupported is present in the user code, add help info
/usr/local/lib/python3.7/dist-packages/numba/core/dispatcher.py in error_rewrite(e, issue_type)
356 raise e
357 else:
--> 358 reraise(type(e), e, None)
359
360 argtypes = []
/usr/local/lib/python3.7/dist-packages/numba/core/utils.py in reraise(tp, value, tb)
78 value = tp()
79 if value.__traceback__ is not tb:
---> 80 raise value.with_traceback(tb)
81 raise value
82
TypingError: Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function(<built-in function getitem>) found for signature:
>>> getitem(float64, int64)
There are 22 candidate implementations:
- Of which 22 did not match due to:
Overload of function 'getitem': File: <numerous>: Line N/A.
With argument(s): '(float64, int64)':
No match.
During: typing of intrinsic-call at <ipython-input-255-d05a5f4ea944> (7)
File "<ipython-input-255-d05a5f4ea944>", line 7:
def add(new,original=np.array([])):
<source elided>
for ii in range(11,19):
if raw[ii]!=new[ii]:
^
Update: Here is how it should work. The function should cover three main cases.
Sample input for new data (1D array):
array([9.0000000e+00, 0.0000000e+00, 1.0000000e+00, 0.0000000e+00,
       0.0000000e+00, nan, 5.7300000e-01, 9.2605450e-01,
       9.3171725e-01, 9.2039175e-01, 9.3450000e-01, 1.6491636e+09,
       1.6494228e+09, 1.6496928e+09, 1.6497504e+09, 9.2377000e-01,
       9.3738000e-01, 9.3038000e-01, 9.3450000e-01, nan,
       nan, nan])
sample input for original data (2d array):
array([[4.00000000e+00, 0.00000000e+00, 1.00000000e+00, 0.00000000e+00,
        0.00000000e+00, nan, 5.23000000e-01, 8.31589755e-01,
        8.34804877e-01, 8.28374632e-01, 8.36090000e-01, 1.64938320e+09,
        1.64966400e+09, 1.64968920e+09, 1.64975760e+09, 8.30750000e-01,
        8.38020000e-01, 8.34290000e-01, 8.36090000e-01, nan,
        nan, nan]])
- new data will be added and there is no original data
add(new)
Output:
array([[9.0000000e+00, 0.0000000e+00, 1.0000000e+00, 0.0000000e+00,
        0.0000000e+00, nan, 5.7300000e-01, 9.2605450e-01,
        9.3171725e-01, 9.2039175e-01, 9.3450000e-01, 1.6491636e+09,
        1.6494228e+09, 1.6496928e+09, 1.6497504e+09, 9.2377000e-01,
        9.3738000e-01, 9.3038000e-01, 9.3450000e-01, nan,
        nan, nan]])
- new data is added that hasn't been added before, and there is original data
add(new,original)
Output:
array([[4.00000000e+00, 0.00000000e+00, 1.00000000e+00, 0.00000000e+00,
        0.00000000e+00, nan, 5.23000000e-01, 8.31589755e-01,
        8.34804877e-01, 8.28374632e-01, 8.36090000e-01, 1.64938320e+09,
        1.64966400e+09, 1.64968920e+09, 1.64975760e+09, 8.30750000e-01,
        8.38020000e-01, 8.34290000e-01, 8.36090000e-01, nan,
        nan, nan],
       [9.00000000e+00, 0.00000000e+00, 1.00000000e+00, 0.00000000e+00,
        0.00000000e+00, nan, 5.73000000e-01, 9.26054500e-01,
        9.31717250e-01, 9.20391750e-01, 9.34500000e-01, 1.64916360e+09,
        1.64942280e+09, 1.64969280e+09, 1.64975040e+09, 9.23770000e-01,
        9.37380000e-01, 9.30380000e-01, 9.34500000e-01, nan,
        nan, nan]])
- new data is passed that has already been added before
add(new,original)
Output:
array([[9.0000000e+00, 0.0000000e+00, 1.0000000e+00, 0.0000000e+00,
        0.0000000e+00, nan, 5.7300000e-01, 9.2605450e-01,
        9.3171725e-01, 9.2039175e-01, 9.3450000e-01, 1.6491636e+09,
        1.6494228e+09, 1.6496928e+09, 1.6497504e+09, 9.2377000e-01,
        9.3738000e-01, 9.3038000e-01, 9.3450000e-01, nan,
        nan, nan]])
CodePudding user response:
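The main issue is that Numba assumes that original is a 1D array, which is not the case. The default value np.array([]) is 1D, so Numba types each raw yielded by the loop as a float64 scalar, and raw[ii] then fails with the getitem(float64, int64) error shown in the log. The pure-Python code works because the interpreter never executes the body of the loop for raw in original when the array is empty, but Numba needs to compile all the code before executing it. You can solve this problem using the following function prototype: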
def add(new,original=np.array([[]])): # Note the `[[]]` instead of `[]`
With that, Numba can correctly deduce that the original array is 2D.
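The difference between the two defaults can be checked with plain NumPy, without Numba. This is only a small illustration: the variable names are made up for the example, and np.empty((0, 22)) is a hypothetical alternative default that is not part of the suggestion above.

```python
import numpy as np

flat_default = np.array([])          # default used in the question
nested_default = np.array([[]])      # default suggested above
typed_default = np.empty((0, 22))    # hypothetical alternative: 0 rows, 22 columns

# All three are "empty" (size 0), so the `original.size != 0` check
# treats each of them as "no old data".
for name, arr in [("flat", flat_default),
                  ("nested", nested_default),
                  ("typed", typed_default)]:
    print(name, "ndim:", arr.ndim, "shape:", arr.shape, "size:", arr.size)

# Iterating a 1D array yields scalars, iterating a 2D array yields 1D rows.
# A scalar cannot be indexed with raw[ii], which is what the typing error reports.
scalar = next(iter(np.array([1.0, 2.0])))
row = next(iter(np.array([[1.0, 2.0]])))
print("1D element ndim:", np.ndim(scalar))
print("2D element ndim:", np.ndim(row))
```

Only the number of dimensions matters for the typing problem: both 2D variants make each raw a 1D array, so raw[ii] has a valid type.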
Note that specifying the dimensions and types of NumPy arrays and other inputs is a good way to avoid such errors and sneaky bugs (e.g. due to integer/float truncation).
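One way to pin down dimensions and types is to normalize the inputs in a thin Python wrapper and keep the compiled part free of default arguments. The following is a minimal sketch under several assumptions, not the only way to do it: the names _add_core, N_COLS, KEY_START and KEY_STOP are made up for the example, the njit fallback exists only so the sketch runs where Numba is not installed, and a row is treated as a duplicate when columns 11 to 18 match any existing row, which is an interpretation of the intent rather than the exact loop from the question.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs without Numba (no compilation then).
    def njit(func):
        return func

N_COLS = 22                   # column count from the question
KEY_START, KEY_STOP = 11, 19  # columns compared in the question's loop


@njit
def _add_core(new, original):
    # `original` is always a 2D float64 array here, so original[r, ii] is typed.
    n_rows = original.shape[0]
    for r in range(n_rows):
        same = True
        for ii in range(KEY_START, KEY_STOP):
            if original[r, ii] != new[ii]:
                same = False
                break
        if same:
            return original  # assumed rule: one matching row means duplicate
    res = np.zeros((n_rows + 1, N_COLS))
    res[:n_rows] = original
    res[n_rows] = new
    return res


def add(new, original=None):
    # Force float64 so integer inputs are not silently truncated later on.
    new = np.ascontiguousarray(new, dtype=np.float64)
    if original is None:
        original = np.empty((0, N_COLS), dtype=np.float64)
    else:
        original = np.ascontiguousarray(original, dtype=np.float64)
    if new.shape != (N_COLS,) or original.ndim != 2 or original.shape[1] != N_COLS:
        raise ValueError("expected new.shape == (22,) and original.shape == (n, 22)")
    return _add_core(new, original)


first_row = np.arange(N_COLS, dtype=np.float64)
table = add(first_row)                 # no old data: one row
table = add(first_row, table)          # duplicate: table is returned unchanged
table = add(first_row + 1.0, table)    # new key columns: a second row is appended
print(table.shape)
```

Because the compiled function always receives a 1D float64 row and a 2D float64 table, Numba sees a single, unambiguous set of argument types, and the "no old data" case becomes an empty table with 0 rows instead of a special branch.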