Sampling a random integer 'N' times according to predetermined probabilities, where the pr...

Let's say I want to sample 0 with probability p0 = 0.5, 1 with probability p1 = 0.3, or 2 with probability p2 = 0.2. This is fairly simple to do:

import numpy as np

p0 = 0.5
p1 = 0.3
p2 = 0.2
idx = np.random.choice(3, p=[p0, p1, p2])

Now, let's say I want to repeat this process N times, each time using different probabilities. Something like:

N = 4
p0 = np.array([0.5, 0.6, 0.7, 0.8])
p1 = np.array([0.3, 0.2, 0.2, 0.1])
p2 = np.array([0.2, 0.2, 0.1, 0.1])
idx = np.empty(N, dtype=int)  # integer array; np.empty defaults to float64

for i in range(N):
    idx[i] = np.random.choice(3, p=[p0[i], p1[i], p2[i]])

However, this is obviously slow. Ideally I'd like to do this avoiding loops. Is there a simple solution to this problem?

CodePudding user response:

One way is to generate a uniform random array of size N, compare it to the cumulative probabilities, then take the index of the first True value in each column:

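cum_probs = np.cumsum([p0, p1, p2], axis=0)  # shape (3, N): running totals down each column
# argmax returns the first row where the uniform draw falls below the cumulative probability
idx = np.argmax(np.random.uniform(size=N) < cum_probs, axis=0)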
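
For what it's worth, here is a self-contained sketch of the same cumulative-sum idea wrapped in a function, with a quick frequency check. The names sample_rows, probs and rng are my own and not from the question. I am assuming the probabilities are stacked as one row per draw and that each row sums to 1. Forcing the last cumulative value to exactly 1 is only a precaution against floating-point round-off, which could otherwise leave a row with no True entry.

```python
import numpy as np


def sample_rows(probs, rng=None):
    """Draw one category index per row of probs, an (N, K) array whose rows sum to 1."""
    rng = np.random.default_rng() if rng is None else rng
    cum_probs = np.cumsum(np.asarray(probs, dtype=float), axis=1)  # shape (N, K)
    cum_probs[:, -1] = 1.0  # precaution: round-off can leave the last total just below 1
    u = rng.random((cum_probs.shape[0], 1))  # one uniform draw in [0, 1) per row
    return np.argmax(u < cum_probs, axis=1)  # first column whose running total exceeds u


# The probabilities from the question, arranged as one row per draw.
p0 = np.array([0.5, 0.6, 0.7, 0.8])
p1 = np.array([0.3, 0.2, 0.2, 0.1])
p2 = np.array([0.2, 0.2, 0.1, 0.1])
idx = sample_rows(np.column_stack([p0, p1, p2]))

# Sanity check: with identical rows, observed frequencies should approach the probabilities.
many = sample_rows(np.tile([0.5, 0.3, 0.2], (200_000, 1)), rng=np.random.default_rng(0))
freq = np.bincount(many, minlength=3) / many.size
```

Here the draws run along rows rather than columns, so argmax uses axis=1, but the comparison is otherwise the same as above. Passing a seeded generator as rng makes the result reproducible.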