Random choice out of 1D array with 2-dimensional probability in python

Time:05-01

I would like to choose randomly from a list of 3 elements (HGA, CGA, SGA), but the probabilities are given as 3 lists, one per element.

My probabilities are given by (the lists have the same length):

Probs = { 'HGA':prob['HGA'], 'CGA':prob['CGA'], 'SGA':prob['SGA'] }

with prob looking like this:

prob['HGA']=[0.5,0.2,0.4,0.6, ...]

Now I want to create another list, without using a loop, that looks something like this:

particles = ['HGA', 'CGA', 'CGA', 'CGA', 'SGA' ...]

'particles' should have the same length as the probability lists.

CodePudding user response:

Assuming Probs gives the probability of selecting each key (with the values summing to 1), you can use numpy.random.choice:

Probs = {'HGA':0.1, 'CGA':0.2, 'SGA':0.7}

import numpy as np
particles = np.random.choice(list(Probs), p=list(Probs.values()), size=100)

Example output (the draw is random, so yours will differ):

array(['SGA', 'SGA', 'SGA', 'SGA', 'SGA', 'SGA', 'SGA', 'HGA', 'HGA',
       'HGA', 'SGA', 'SGA', 'SGA', 'SGA', 'SGA', 'SGA', 'SGA', 'CGA',
       'SGA', 'CGA', 'SGA', 'SGA', 'SGA', 'CGA', 'SGA', 'SGA', 'SGA',
       'HGA', 'SGA', 'SGA', 'SGA', 'SGA', 'SGA', 'SGA', 'SGA', 'SGA',
       'HGA', 'SGA', 'SGA', 'CGA', 'CGA', 'SGA', 'SGA', 'SGA', 'SGA',
       'SGA', 'CGA', 'CGA', 'CGA', 'SGA', 'CGA', 'SGA', 'CGA', 'CGA',
       'CGA', 'SGA', 'CGA', 'CGA', 'SGA', 'SGA', 'HGA', 'SGA', 'HGA',
       'SGA', 'SGA', 'SGA', 'SGA', 'SGA', 'HGA', 'CGA', 'CGA', 'CGA',
       'CGA', 'SGA', 'SGA', 'HGA', 'SGA', 'SGA', 'CGA', 'SGA', 'HGA',
       'SGA', 'SGA', 'SGA', 'SGA', 'CGA', 'SGA', 'CGA', 'CGA', 'SGA',
       'HGA', 'SGA', 'HGA', 'SGA', 'CGA', 'SGA', 'SGA', 'CGA', 'SGA',
       'SGA'], dtype='<U3')

For a list, use:

particles = (np.random.choice(list(Probs), p=list(Probs.values()), size=100)
               .tolist()
             )
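The same draw can also be written with NumPy's Generator interface, which makes it easy to seed the result. This is a minimal sketch rather than part of the answer above; the name `rng` and the seed value 0 are arbitrary choices made for illustration.

```python
import numpy as np

Probs = {'HGA': 0.1, 'CGA': 0.2, 'SGA': 0.7}

# The seed is an arbitrary choice, used only to make the draw repeatable.
rng = np.random.default_rng(0)

# One fixed probability per key, 100 independent draws, returned as a list.
particles = rng.choice(list(Probs), size=100, p=list(Probs.values())).tolist()
```

Note that this still applies one fixed probability per key to every draw, so it does not cover the case where the probabilities change from element to element.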

CodePudding user response:

If I understood correctly, the i-th element of each probability list is the probability of sampling the corresponding item at the i-th step, which means the i-th elements of all the lists should always sum to 1. If so, this should be what you are asking for. Here is a toy example:

import numpy as np

Probs = {'HGA':[0.2, 0.6, 0.2], 'CGA':[0.7, 0.1, 0.3], 'SGA':[0.1, 0.3, 0.5]}
values = list(Probs.keys())

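# zip(*Probs.values()) yields one tuple of probabilities per step;
# str() turns NumPy's string scalar into a plain Python string
particles = [str(np.random.choice(values, p=sample_probs)) for sample_probs in zip(*Probs.values())]

# e.g. ['CGA', 'HGA', 'HGA'] (the result is random)
print(particles)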
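The list comprehension above relies on each step's probabilities summing to 1; otherwise np.random.choice raises a ValueError. A quick check along these lines may help before sampling. The names `row_sums`, `is_valid`, and `p_normalized` are illustrative, and renormalizing is only one possible way to handle a mismatch.

```python
import numpy as np

Probs = {'HGA': [0.2, 0.6, 0.2], 'CGA': [0.7, 0.1, 0.3], 'SGA': [0.1, 0.3, 0.5]}

# Rows are steps, columns are items in key order (HGA, CGA, SGA).
p = np.column_stack(list(Probs.values())).astype(float)

# Every row should sum to 1, up to floating-point round-off.
row_sums = p.sum(axis=1)
is_valid = np.allclose(row_sums, 1.0)

# Optional: rescale each row so it sums to 1 (assumes no row sums to 0).
p_normalized = p / row_sums[:, None]
```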

For a fast vectorized version, following this excellent answer:

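def vectorized_choice(p, n, items=None):
    # p has shape (steps, items); each row is one step's probabilities
    s = p.cumsum(axis=1)                   # row-wise cumulative probabilities
    r = np.random.rand(p.shape[0], n, 1)   # n uniform draws per step
    q = np.expand_dims(s, 1) >= r          # shape (steps, n, items)
    k = q.argmax(axis=-1)                  # index of the first True per draw
    if items is not None:
        k = np.asarray(items)[k]           # map indices to labels
    return k

# one row per step, one column per key
p = np.column_stack(tuple(Probs.values()))
n = 1
items = list(Probs.keys())

# sample has shape (steps, n); with n = 1, sample[:, 0].tolist() gives a flat list
sample = vectorized_choice(p, n, items)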
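For the case in the question (exactly one draw per step, returned as a flat list), the same cumulative-sum idea can be condensed as sketched below. The function name `choice_per_step` and its `rng` parameter are hypothetical, and forcing the last cumulative column to 1 is an added guard against round-off, not something the answer above does.

```python
import numpy as np

def choice_per_step(probs, rng=None):
    """Draw one key per step; `probs` maps each key to its per-step probabilities."""
    rng = np.random.default_rng() if rng is None else rng
    items = np.array(list(probs))
    # Shape (steps, items): row i holds the probabilities for step i.
    p = np.column_stack(list(probs.values())).astype(float)
    cdf = p.cumsum(axis=1)
    cdf[:, -1] = 1.0                  # guard against floating-point round-off
    r = rng.random((p.shape[0], 1))   # one uniform draw in [0, 1) per step
    idx = (r < cdf).argmax(axis=1)    # first column whose cumulative value exceeds r
    return items[idx].tolist()

Probs = {'HGA': [0.2, 0.6, 0.2], 'CGA': [0.7, 0.1, 0.3], 'SGA': [0.1, 0.3, 0.5]}
particles = choice_per_step(Probs)
```

The result has one entry per step, so `particles` has the same length as each probability list and no Python-level loop is involved.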