Select a number of items from an ordered list with decreasing probability in Python-CodePudding

I have an ordered list that I want to select items from but with a decreasing probability and for the steepeness of that probability to be able to be changed. I have been able to select the first n number of items easily in Python using:

list = [1,2,3,4,5,6,7,8 ... 46]

subset = list[0:3]

and a random sample from that list as:

list = [1,2,3,4,5,6,7,8 ... 46]

subset = random.sample(list, k =3)

But I'm not sure how to get different probabilities within that range i.e. a greater probability of selecting 1,2 or 3 than compared to 44, 45 and 46 with the probability continually decreasing. This would also be for different subset sizes such as 5, 7, 10 and 16. Any ideas would be appreciated. I want this to be done without replacement.

CodePudding user response：

Numpy's random.choice supports picking samples with weights:

from numpy.random import choice

pop = range(1, 47)

weights = [1/(idx 1) for idx in range(len(pop))]
sw = sum(weights)
weights = [w/sw for w in weights] # weights need to sum to 1

k = 3

print(choice(pop, k, p=weights, replace=False))

You can change the specifics of probabilities by constructing weights list with a different function.

CodePudding user response：

Your question is quite broad, so I have developed quite a broad solution. It's probably not optimal in terms of computation time or storage space, but it does solve the stated problem and it is useful as learning device to understand how such an algorithm might work:

import random as r

def vectorTotal(vec):
    output = 0
    for x in vec:
        output  = x
    return(output)

def reduce(vec):
    output = []
    for x in vec:
        output.append(x/vectorTotal(vec))
    return(output)

xValues = [1,2,3,4,5,6,7,8,9,10]
pValues = [10,6,8,1,8,2,3,9,1,20]

probabilities = reduce(pValues)

def pickWithProb(options, probs):
    rand1 = r.uniform(0,1)
    threshold = 0
    for i in range(len(options)):
        threshold  = probs[i]
        if threshold > rand1:
            return(options[i])
    
    
pickWithProb(xValues, probabilities)

So the thing I called "xValues" are the options, I just put in 1:10 so it was easy to keep track of everything. "pValues" is the relative likelihood of each option being chosen, but it doesn't have to be a valid probability distribution. "Reduce" turns pValues into a valid probability distribution and "vectorTotal" was used to implement "Reduce".

In order to actually use this, it may be helpful to have some kind of function to actually generate pValues, something like:

def generatePValues(xValues):
    output = []
    for x in xValues:
        output.append(2**-x)
    return(output)

CodePudding user response：

This will work but it depends on your desired distribution I guess.


list = [1,2,3,4,5,6,7,8 ... 46]

def get_random_weighted_value(list):
    # Make this value lower to decrease the chance of each value being picked
    probability_value = 0.5
    for value in list:
        if random.random() < probability_value/value:
            return value
        # You can also add values to the probability value here
        # this will make the probabilities closer together
        # i.e.
        probability_value = 0.5   (0.3*value)

This will select a value based on the following probabilities (for probability_value = 0.5):

1: 50%, 2: 25%, 3: 16.75%...