Home > Mobile >  Numpy Random Choice with Non-regular Array Size
Numpy Random Choice with Non-regular Array Size

Time:11-19

I'm making an array of sums of random choices from a negative binomial distribution (nbd), with each sum being of non-regular length. Right now I implement it as follows:

import numpy
from numpy.random import default_rng
rng = default_rng()

nbd = rng.negative_binomial(1, 0.5, int(1e6))
gmc = [12, 35, 4, 67, 2]
n_pp = np.empty(len(gmc))
for i in range(len(gmc)):
    n_pp[i] = np.sum(rng.choice(nbd, gmc[i]))

This works, but when I perform it over my actual data it's very slow (gmc is of dimension 1e6), and I would like to vary this for multiple values of n and p in the nbd (in this example they're set to 1 and 0.5, respectively).

I'd like to work out a pythonic way to do this which eliminates the loop, but I'm not sure it's possible. I want to keep default_rng for the better random generation than the older way of doing it (np.random.choice), if possible.

CodePudding user response:

The distribution of the sum of m samples from the negative binomial distribution with parameters (n, p) is the negative binomial distribution with parameters (m*n, p). So instead of summing random selections from a large, precomputed sample of negative_binomial(1, 0.5), you can generate your result directly with negative_binomial(gmc, 0.5):

In [68]: gmc = [12, 35, 4, 67, 2]

In [69]: npp = rng.negative_binomial(gmc, 0.5)

In [70]: npp
Out[70]: array([ 9, 34,  1, 72,  7])

(The negative_binomial method will broadcast its inputs, so we can pass gmc as an argument to generate all the samples with one call.)

More generally, if you want to vary the n that is used to generate nbd, you would multiply that n by the corresponding element in gmc and pass the product to rng.negative_binomial.

  • Related