Consider the simplest possible function:
import numba
import numpy as np

@numba.jit
def foo(s1):
    return s1
Now construct an array of np.bytes_ objects:
> a = np.array(['abc']*5, dtype='S5')
> a
array([b'abc', b'abc', b'abc', b'abc', b'abc'], dtype='|S5')
Why does calling foo with the vector work:
> foo(a)
array([b'abc', b'abc', b'abc', b'abc', b'abc'], dtype='|S5')
But calling foo with a single element raises an exception?
> foo(a[0])
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_8124/2559272744.py in <module>
----> 1 foo(a[0])
TypeError: bad argument type for built-in operation
(This is running numba 0.54.1 from conda-forge on Windows with Python 3.9.7 and numpy 1.20.3)
CodePudding user response:
Neither the bytes nor the np.bytes_ type is listed in the set of types supported by numba as of the latest release. The closest things it supports would be:
- Character sequences (read: str) (though it specifically says "no operations are available on them", so this is pretty useless); your function would work if you called foo(a[0].decode()) to make it text (but only because it's a pretty useless function)
- Actual numpy arrays; the cost to view the bytes/np.bytes_ as an np.array is pretty low, so you could just do: foo(np.frombuffer(a[0], np.uint8)) and produce something that is more programmatically useful and represents the same data.
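To make the two options above concrete, here is a minimal sketch of what each conversion produces before it is handed to foo. It uses only numpy; the variable names (elem, as_text, as_uint8) are made up for illustration, and the decode step assumes the stored bytes are valid UTF-8.

```python
import numpy as np

a = np.array(['abc'] * 5, dtype='S5')
elem = a[0]  # a numpy.bytes_ scalar (trailing padding bytes are stripped)

# Option 1: decode to text. Assumes the payload is valid UTF-8.
as_text = elem.decode()

# Option 2: view the scalar's bytes as a uint8 array.
# np.frombuffer does not copy the buffer, so the result is read-only.
as_uint8 = np.frombuffer(elem, dtype=np.uint8)

print(type(elem).__name__, repr(as_text), as_uint8, as_uint8.flags.writeable)
```

Either as_text or as_uint8 can then be passed to the jitted function in place of a[0]; the array form is usually the more useful one if the function is meant to do real work on the data.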
CodePudding user response:
The bytes type is barely supported, like the str type. Both are supported very inefficiently and the support is minimal. Moreover, there are some open related bugs (like this one). Furthermore, AFAIK, there is no plan to work on this any time soon.
From my understanding, a[0] returns a numpy.bytes_-typed object, which is not completely compatible with bytes (at least for Numba). Compiling the function with numpy.bytes_ appears to trigger a bug that makes Numba confuse numpy.bytes_ with bytes (Numba tries to use a compiled function with the wrong type).
Indeed, the following code works:
@numba.jit
def foo(s1):
    return s1

foo(b'test')      # Works
foo(bytes(a[0]))  # Works
The following code fails:
@numba.jit
def foo(s1):
    return s1

foo(a[0])         # Fails and triggers the bug
foo(bytes(a[0]))  # Now fails too (the function is not recompiled properly)
foo(b'test')      # Also fails (the function is not recompiled properly)
Note that the bytes type is only supported in read-only mode.