I am using numpy.bincount with a vector of indices ind and a vector of weights coef, and I am trying to run np.bincount(ind, coef). The problem is that my weight vector is not of type float64; its elements belong to a non-built-in class that supports the arithmetic operator __add__.
How can I do this? Running np.bincount(ind, coef) directly gives me this error:
TypeError: Cannot cast array data from dtype('O') to dtype('float64') according to the rule 'safe'
The specific type I am considering is LaurentPolynomialRing from SageMath.
CodePudding user response:
bincount is compiled code, so we can't (readily) see what it does; we can only deduce things from its behavior.
The basic count:
In [303]: np.bincount(x)
Out[303]: array([1, 2, 3])
But adapting the weights example to provide an int array of weights:
In [304]: #w = np.array([0.3, 0.5, 0.2, 0.7, 1., -0.6]) # weights
...: w = np.array([3,5,2,7,10,-6])
...: x = np.array([0, 1, 1, 2, 2, 2])
...: np.bincount(x, weights=w)
Out[304]: array([ 3., 7., 11.])
This is consistent with your error. The result is float even when the weights are int, so bincount evidently converts the weights to float64, and an object array cannot be cast to float64 under the 'safe' rule.
It might do something like this, but in compiled code:
In [306]: res = np.zeros(3)
In [307]: for i,v in zip(x,w):
     ...:     res[i] += v
     ...:
In [308]: res
Out[308]: array([ 3., 7., 11.])
I'm guessing this because it returns a result for each integer value from 0 to x.max(). Written like this, it just requires the elements of w to have __add__. But this kind of iteration on an object dtype array is slow, even in compiled code, since it has to call the __add__ of each element object. It can't just zip through the byte data-buffer of the w array.
Without the consecutive bin value constraint, a defaultdict is an easy tool for collecting like values.
In [309]: from collections import defaultdict
In [310]: dd = defaultdict(float)
In [311]: for i,v in zip(x,w):
     ...:     dd[i] += v
     ...:
In [312]: dd
Out[312]: defaultdict(float, {0: 3.0, 1: 7.0, 2: 11.0})
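For non-numeric weights the default factory matters, because defaultdict(float) starts every bin at 0.0. A minimal sketch, again using fractions.Fraction as a stand-in for the real weight type (the variable names are illustrative only):

```python
from collections import defaultdict
from fractions import Fraction

# The factory must return the additive identity of the weight type;
# Fraction() with no arguments is Fraction(0).
sums = defaultdict(Fraction)

keys = [0, 7, 7, 100]              # bins need not be consecutive
weights = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6), Fraction(5)]
for k, v in zip(keys, weights):
    sums[k] += v                   # uses the weight type's __add__

totals = dict(sums)
```

Only the keys that actually occur are stored, so widely spaced indices do not create a large, mostly empty result.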
Another way, again where the x values are indices into the return array:
In [313]: res = np.zeros(3)
In [315]: np.add.at(res, x, w)
In [316]: res
Out[316]: array([ 3., 7., 11.])
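I think all of these will work with objects that have __add__, as long as the accumulator is not a float array: np.zeros(3) is float64, so for such objects res would need to be created with dtype=object.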
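As a sketch of the np.add.at variant with an object accumulator, assuming the weight type can be added to the int 0 that np.zeros puts in an object array (fractions.Fraction is used here as a stand-in; whether a given Sage type satisfies this is an assumption to check):

```python
from fractions import Fraction
import numpy as np

x = np.array([0, 1, 1, 2, 2, 2])
w = np.array([Fraction(1, 2), Fraction(1, 3), Fraction(1, 6),
              Fraction(1), Fraction(2), Fraction(-3)], dtype=object)

# Object-dtype accumulator: each slot initially holds the Python int 0.
res = np.zeros(int(x.max()) + 1, dtype=object)

# Unbuffered in-place add: repeated indices in x are all accumulated.
np.add.at(res, x, w)
```

This still calls the Python-level addition once per element, so it is not expected to be faster than an explicit loop in any fundamental way.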