I would like to choose randomly among three labels (HGA, CGA, SGA), where each label has its own list of probabilities.
The probabilities are given by (all three lists have the same length):
Probs = { 'HGA':prob['HGA'], 'CGA':prob['CGA'], 'SGA':prob['SGA'] }
where each entry of prob looks like this:
prob['HGA']=[0.5,0.2,0.4,0.6, ...]
Now I want to create another list, without using a loop, that looks something like this:
particles = ['HGA', 'CGA', 'CGA', 'CGA', 'SGA', ...]
'particles' should have the same length as the probability lists.
CodePudding user response:
Assuming Probs indicates the probability of selecting each key (with the values summing to 1), you can use numpy.random.choice:
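import numpy as np
Probs = {'HGA': 0.1, 'CGA': 0.2, 'SGA': 0.7}
# Draw 100 keys, each with the probability stored as its value
particles = np.random.choice(list(Probs), p=list(Probs.values()), size=100)
Example output (the draw is random, so results vary between runs):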
array(['SGA', 'SGA', 'SGA', 'SGA', 'SGA', 'SGA', 'SGA', 'HGA', 'HGA',
'HGA', 'SGA', 'SGA', 'SGA', 'SGA', 'SGA', 'SGA', 'SGA', 'CGA',
'SGA', 'CGA', 'SGA', 'SGA', 'SGA', 'CGA', 'SGA', 'SGA', 'SGA',
'HGA', 'SGA', 'SGA', 'SGA', 'SGA', 'SGA', 'SGA', 'SGA', 'SGA',
'HGA', 'SGA', 'SGA', 'CGA', 'CGA', 'SGA', 'SGA', 'SGA', 'SGA',
'SGA', 'CGA', 'CGA', 'CGA', 'SGA', 'CGA', 'SGA', 'CGA', 'CGA',
'CGA', 'SGA', 'CGA', 'CGA', 'SGA', 'SGA', 'HGA', 'SGA', 'HGA',
'SGA', 'SGA', 'SGA', 'SGA', 'SGA', 'HGA', 'CGA', 'CGA', 'CGA',
'CGA', 'SGA', 'SGA', 'HGA', 'SGA', 'SGA', 'CGA', 'SGA', 'HGA',
'SGA', 'SGA', 'SGA', 'SGA', 'CGA', 'SGA', 'CGA', 'CGA', 'SGA',
'HGA', 'SGA', 'HGA', 'SGA', 'CGA', 'SGA', 'SGA', 'CGA', 'SGA',
'SGA'], dtype='<U3')
For a list, use:
particles = (np.random.choice(list(Probs), p=list(Probs.values()), size=100)
               .tolist()
             )
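The same draw can also be sketched with NumPy's newer Generator interface, which makes it easy to seed the sampling so a run can be repeated. The seed value 0 is an arbitrary choice for illustration and is not part of the original answer:

```python
import numpy as np

# Same fixed probabilities as in the answer above.
Probs = {'HGA': 0.1, 'CGA': 0.2, 'SGA': 0.7}

# Assumption: seed 0 is arbitrary; it only makes the draw repeatable.
rng = np.random.default_rng(0)

# Draw 100 keys with the given probabilities and convert to a plain list.
particles = rng.choice(list(Probs), p=list(Probs.values()), size=100).tolist()
```

Dropping the seed (np.random.default_rng()) gives a different sample on every run.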
CodePudding user response:
If I understood correctly, the i-th element of each probability list is the probability of sampling the corresponding item at the i-th step, which means the i-th entries of all the lists must sum to 1. If so, this should be what you are asking for. Here is a toy example:
import numpy as np
Probs = {'HGA':[0.2, 0.6, 0.2], 'CGA':[0.7, 0.1, 0.3], 'SGA':[0.1, 0.3, 0.5]}
values = list(Probs.keys())
particles = [np.random.choice(values, p=sample_probs) for sample_probs in zip(*Probs.values())]
print(particles)
# e.g. ['CGA', 'HGA', 'HGA'] (random, varies between runs)
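Since this relies on the i-th entries summing to 1, it may be worth checking that before sampling. A minimal sketch, using the toy data from above; the names p, row_sums, sums_ok and p_normalized are only illustrative:

```python
import numpy as np

Probs = {'HGA': [0.2, 0.6, 0.2], 'CGA': [0.7, 0.1, 0.3], 'SGA': [0.1, 0.3, 0.5]}

# One row per step, one column per key.
p = np.column_stack(list(Probs.values()))

# Each row should sum to 1 (up to floating-point tolerance).
row_sums = p.sum(axis=1)
sums_ok = np.allclose(row_sums, 1.0)

# Optional: rescale each row so it sums to 1.
# Assumes no row is all zeros; otherwise this divides by zero.
p_normalized = p / row_sums[:, None]
```

If sums_ok is False, np.random.choice would raise a ValueError for the offending step, so normalizing first (or fixing the input) avoids that.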
For a fast vectorized version, following this excellent answer:
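def vectorized_choice(p, n, items=None):
    # p: array of shape (steps, labels), each row summing to 1
    # n: number of draws per step
    s = p.cumsum(axis=1)                  # cumulative probabilities per row
    r = np.random.rand(p.shape[0], n, 1)  # uniform draws in [0, 1)
    q = np.expand_dims(s, 1) >= r         # True where the cumulative probability reaches the draw
    k = q.argmax(axis=-1)                 # index of the first True along the label axis
    if items is not None:
        k = np.asarray(items)[k]          # map indices to labels
    return k

p = np.column_stack(tuple(Probs.values()))  # one row per step, one column per key
n = 1
items = list(Probs.keys())
sample = vectorized_choice(p, n, items)     # array of shape (len(p), n)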
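If only one label per step is needed, as a flat list like the 'particles' list in the question, the same cumulative-sum idea can be written more compactly. This is a self-contained sketch and not the code from the linked answer; the seed and the names labels, cdf, r and idx are assumptions for illustration:

```python
import numpy as np

Probs = {'HGA': [0.2, 0.6, 0.2], 'CGA': [0.7, 0.1, 0.3], 'SGA': [0.1, 0.3, 0.5]}

labels = np.array(list(Probs))
p = np.column_stack(list(Probs.values()))  # shape (steps, labels)

rng = np.random.default_rng(0)   # assumption: arbitrary seed for repeatability
cdf = p.cumsum(axis=1)           # cumulative probabilities per step
r = rng.random((p.shape[0], 1))  # one uniform draw in [0, 1) per step

# Count how many cumulative thresholds each draw has passed; that count
# is the index of the selected label. The clip guards against a row whose
# cumulative sum ends slightly below 1 because of rounding.
idx = np.minimum((r >= cdf).sum(axis=1), len(labels) - 1)

particles = labels[idx].tolist()  # plain list with one label per step
```

The result has the same length as the probability lists, and no Python-level loop is involved.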