I would like to generate a sparse matrix using a custom discrete distribution.
E.g.:
import numpy as np
from scipy.sparse import random
from scipy import stats
xk = np.arange(7)
pk = np.array([0.1, 0.2, 0.3, 0.1, 0.1, 0.1, 0.1])
custm = stats.rv_discrete(name='custm', values=(xk, pk))
rvs = custm.rvs
dens = 0.5
S = random(1, 8, density=dens, data_rvs=rvs)
print(S.A)
I would expect to get something like so:
[4, 5, 0, 3, 0, 1, 0, 0]
The above code works just fine for rvs of a continuous distribution (such as rvs = stats.norm().rvs), but when I use a discrete distribution I am met with this error:
File "Python\Python310\site-packages\scipy\sparse\_construct.py", line 875, in random
vals = data_rvs(k).astype(dtype, copy=False)
AttributeError: 'int' object has no attribute 'astype'
This error persists even if I convert the xk array to floats using np.array(xk, dtype=np.float64).
I have considered some workarounds, such as generating the array indices in advance and then randomly picking xk values. That is a bit messy, but doable. I suspect, though, that I am misunderstanding how sparse.random handles random variates.
CodePudding user response:
The docs on rv_discrete.rvs() say this:
size
Defining number of random variates (Default is 1). Note that size has to be given as keyword, not as positional argument.
However, scipy.sparse.random() passes the size as a positional argument:
vals = data_rvs(k).astype(dtype, copy=False)
Therefore, the fix is to define a lambda function which accepts a positional argument and calls rvs() with a keyword argument.
S = random(1, 8, density=dens, data_rvs=lambda k: rvs(size=k))
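For completeness, here is a minimal, self-contained sketch of that fix, reusing the xk and pk values from the question. The alias sparse_random and the helper name draw are only illustrative choices, and the first two calls are there just to show the difference between passing the count positionally and as a keyword.

```python
import numpy as np
from scipy import stats
from scipy.sparse import random as sparse_random

xk = np.arange(7)
pk = np.array([0.1, 0.2, 0.3, 0.1, 0.1, 0.1, 0.1])
custm = stats.rv_discrete(name='custm', values=(xk, pk))

# Positional call: the argument is not interpreted as size,
# so a single scalar comes back instead of an array.
single = custm.rvs(3)

# Keyword call: size=3 yields an array of three variates.
several = custm.rvs(size=3)

# Illustrative wrapper: takes the count that scipy.sparse.random
# passes in and forwards it to rvs() as the size keyword.
def draw(size):
    return custm.rvs(size=size)

S = sparse_random(1, 8, density=0.5, data_rvs=draw)
print(S.toarray())
```

Two things to keep in mind: the result is stored as floats unless a dtype is passed, and because 0 is one of the values in xk, some of the sampled entries can themselves be 0, so the printed row may show fewer visible nonzeros than the density suggests.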
CodePudding user response:
One option is to define custm like this:
custm = stats.rv_discrete(name='custm', values=(xk, pk))()
Note the addition of () at the end of that line. This causes custm to be an instance of rv_discrete_frozen. The first argument of the rvs() method of this instance is size, which is what you want.
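A short sketch of the frozen-distribution approach, again using the xk and pk values from the question. The names frozen and sparse_random are just illustrative; the point is that the frozen object's rvs can be handed to data_rvs directly, without a wrapper.

```python
import numpy as np
from scipy import stats
from scipy.sparse import random as sparse_random

xk = np.arange(7)
pk = np.array([0.1, 0.2, 0.3, 0.1, 0.1, 0.1, 0.1])

# The trailing () freezes the distribution.
frozen = stats.rv_discrete(name='custm', values=(xk, pk))()

# On the frozen object, size is the first positional parameter of rvs(),
# so a positional count returns an array of that length.
sample = frozen.rvs(5)

# frozen.rvs can therefore be passed as data_rvs as-is.
S = sparse_random(1, 8, density=0.5, data_rvs=frozen.rvs)
print(S.toarray())
```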