Sampling non-repeating integers with a given probability distribution

I need to sample n different values taken from a set of integers.

These integers should have different probabilities of occurrence, e.g. the larger the value, the more likely it is to be drawn.

Using the random package, I can sample a set of distinct values from the set by means of the method

random.sample

However, it doesn't seem to offer a way to attach a probability distribution.

On the other hand, there is the numpy package, which allows a distribution to be specified, but it returns a sample with repetitions. This can be done with the method

numpy.random.choice

I am looking for a method (or a workaround) that combines what the two methods do.

CodePudding user response：

You can actually use numpy.random.choice, as it has a replace parameter. If it is set to False, the sampling is done without replacement.

Here's a random example:

>>> import numpy as np
>>> np.random.choice([1, 2, 4, 6, 9], 3, replace=False, p=[1/2, 1/8, 1/8, 1/8, 1/8])
array([1, 9, 4])
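
To match the "the larger, the likelier" requirement from the question, the probabilities can be derived from the values themselves. The following is only a minimal sketch: it assumes all integers are positive and that a weight simply proportional to each value is acceptable. The helper name weighted_sample and its seed parameter are illustrative and do not come from the question.

```python
import numpy as np

def weighted_sample(values, n, seed=None):
    """Draw n distinct values, larger values being more likely (illustrative helper)."""
    values = np.asarray(values)
    # Assumption: all values are positive, so they can serve directly as weights.
    p = values / values.sum()
    rng = np.random.default_rng(seed)
    # replace=False guarantees that no value is drawn twice.
    return rng.choice(values, size=n, replace=False, p=p)

sample = weighted_sample([1, 2, 4, 6, 9], 3, seed=0)
print(sample)
```

Any other weighting (for example squared values) could be swapped in, as long as p is non-negative and sums to 1.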

CodePudding user response：

Let size be the size of the sample and p the probability distribution (a normalized list of probabilities).

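First generate the size of the sample as follows (p must be passed by keyword, because the third positional parameter of numpy.random.choice is replace):

n = numpy.random.choice(size, 1, p=p)[0]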

Then create the sample from the desired set of integers as follows:

random.sample(integers, n)
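
The two steps above might look like this when put together. This is a rough sketch under assumed inputs: the values of integers, size, and p are made up for illustration. Note that random.sample picks its elements uniformly, so in this sketch p only influences how many values are drawn, not which ones.

```python
import random
import numpy as np

integers = [1, 2, 4, 6, 9]   # assumed set of integers
size = 4                     # candidate sample sizes are 0, 1, 2, 3
p = [0.1, 0.2, 0.3, 0.4]     # assumed distribution over those sizes, sums to 1

# Step 1: draw the sample size n according to p (p is passed by keyword).
n = int(np.random.choice(size, 1, p=p)[0])

# Step 2: draw n distinct integers; random.sample chooses them uniformly.
sample = random.sample(integers, n)
print(n, sample)
```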