Creating an array that adds up to 1 with certain conditions in Python

Here is the problem that I have to solve in Python.

I have to create an array of random numbers that add up to 1. But, there are a few conditions to consider.

1. The number of elements in the array is fixed. For example, take a list of size 7.
2. Certain elements in this array need to be higher or lower than a given number. Let's say that the second element needs to be higher than 0.4, the third element needs to be lower than 0.2, and the seventh element needs to be lower than 0.1.
3. After steps 1 and 2, the sum of the list should be 1.

I do not want to generate candidate arrays and keep only the ones that happen to satisfy these conditions. Instead, each iteration of my loop should create an array that already satisfies them.

The script will tell me how long the list should be and which elements need to be higher or lower than certain values. The information I can use looks like this:

higher_than = 0.4
lower_than_1 = 0.2
lower_than_2 = 0.1

array = [1, 2, 0, 1, 1, 1, 0]

Here 2 means that the resulting weight should be higher than "higher_than". The first 0 means that the corresponding element in the result should be lower than "lower_than_1", and the second 0 means that the corresponding element should be lower than "lower_than_2". I will then use this info to come up with a solution to the problem stated above.
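The encoding described above could be unpacked into per-element bounds along these lines. This is only a sketch: the function name `parse_spec`, the `lower_thans` list, and the reading of code 1 as "no constraint" are assumptions made for illustration and are not stated in the question.

```python
def parse_spec(codes, higher_than, lower_thans):
    """Turn the code array into (lower_bound, upper_bound) pairs, one per element.

    Assumed meaning of the codes:
      2 -> element must be higher than `higher_than`
      0 -> element must be lower than the next unused value in `lower_thans`
      1 -> no constraint (assumption; the question does not say this explicitly)
    """
    remaining_lowers = iter(lower_thans)
    bounds = []
    for code in codes:
        if code == 2:
            bounds.append((higher_than, 1.0))
        elif code == 0:
            # Zeros consume the lower_than values in the order they appear.
            bounds.append((0.0, next(remaining_lowers)))
        elif code == 1:
            bounds.append((0.0, 1.0))
        else:
            raise ValueError(f"unknown code: {code}")
    return bounds


bounds = parse_spec([1, 2, 0, 1, 1, 1, 0], higher_than=0.4, lower_thans=[0.2, 0.1])
print(bounds)
```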

I would very much like to hear your insights and solutions to the problem. Thank you in advance.

CodePudding user response:

There are lots of ways you could do this. Here's one.

Start with your higher_than buckets. With a floor of 0.4, if you have more than 2 of them then there's no solution, because the floors alone would already exceed 1.0. Otherwise allocate the floor to each of them (0.4 in your example).

Next, treat each bucket as having a cap on how much more it can receive. For the lower-thans, the caps are given. For the higher_thans, use 1.0 less the higher_than floor (so 0.6 here).
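The floor and cap bookkeeping described so far can be sketched as follows. The helper names `build_floors_and_caps` and `is_feasible` are hypothetical, and the strict comparisons are an assumption that reads "higher than" and "lower than" as strict inequalities.

```python
def build_floors_and_caps(lower_caps, higher_floors):
    """List the lower_than buckets first, then the higher_than buckets."""
    floors = [0.0] * len(lower_caps) + list(higher_floors)
    # A higher_than bucket can still receive at most 1.0 minus its floor.
    caps = list(lower_caps) + [1.0 - floor for floor in higher_floors]
    return floors, caps


def is_feasible(floors, caps):
    """True if some allocation within the caps can bring the total to 1.0."""
    remaining = 1.0 - sum(floors)
    # Something must be left after the floors, and the caps must leave room for it.
    return remaining > 0 and sum(caps) > remaining


floors, caps = build_floors_and_caps([0.1, 0.1, 0.2, 0.2], [0.4])
print(floors, caps, is_feasible(floors, caps))

# Three buckets with a 0.4 floor already need more than 1.0.
print(is_feasible(*build_floors_and_caps([], [0.4, 0.4, 0.4])))
```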

Record rand() * cap for each bucket. Rescale these values so that they sum to the capacity remaining after the initial allocation to the higher_than buckets, and complete the allocation.

E.g., say we have two buckets with a cap of 0.1, two with a cap of 0.2, and one with a floor of 0.4.

1. allocate weight to the higher_than bucket.
   weights = [0.0, 0.0, 0.0, 0.0, 0.4] (buckets in order I listed them)
2. rand() * cap:
   unscaled allocation = [0.73 * 0.1, 0.24 * 0.1, 0.34 * 0.2, 0.87 * 0.2, 0.33 * 0.6] = [0.073, 0.024, 0.068, 0.174, 0.198]

Now the scaling is a bit tricky because the naive approach, scaling everything by the ratio of the desired to current allocation, risks exceeding a lower_than cap.

Handle this by applying the scaling factor to the headroom left in each bucket, that is, to the array of (cap minus unscaled allocation), which I'll call caps_less_allocation. I.e. [0.1 - 0.073, 0.1 - 0.024, 0.2 - 0.068, 0.2 - 0.174, 0.6 - 0.198] = [0.027, 0.076, 0.132, 0.026, 0.402]
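To check the arithmetic so far, the two arrays can be recomputed in a few lines. The random draws are hard-coded to the values used in the example rather than taken from a random number generator, and the variable names are chosen for illustration only.

```python
caps = [0.1, 0.1, 0.2, 0.2, 0.6]
draws = [0.73, 0.24, 0.34, 0.87, 0.33]  # fixed stand-ins for rand(), as in the example

# rand() * cap for each bucket
unscaled = [draw * cap for draw, cap in zip(draws, caps)]

# How much each bucket could still take before hitting its cap
caps_less_allocation = [cap - alloc for cap, alloc in zip(caps, unscaled)]

print([round(x, 3) for x in unscaled])
print([round(x, 3) for x in caps_less_allocation])
```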

We want to use this to scale the unscaled allocation array so that it sums to 0.6. Currently it sums to 0.537, and the caps_less_allocation array sums to 0.663. We need to boost our allocation by 0.6 - 0.537 = 0.063. So we multiply everything in the caps_less_allocation array by 0.063/0.663, then sum up our three arrays:

[0.0,     0.0,     0.0,     0.0,     0.4] - initial weight array
[0.073,   0.024,   0.068,   0.174,   0.198] - unscaled allocation array
[0.00257, 0.00722, 0.01254, 0.00247, 0.03820] - additive scaling factors (rounded)
---------------------------------------------------------------------
[0.07557, 0.03122, 0.08054, 0.17647, 0.63620]

Now we have a random array that meets our constraints and sums to 1.0.
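Putting the steps together, a minimal sketch of the whole procedure might look like this. The function name `sample_weights` and its `rng` parameter are hypothetical. Two things in it go beyond the walkthrough and should be treated as assumptions: when the random draws already add up to more than the remaining capacity, they are shrunk proportionally (which cannot push any bucket over its cap), and unconstrained elements are given a cap equal to the capacity left after the floors.

```python
import random


def sample_weights(floors, caps, rng=random):
    """Return weights floors[i] + extra[i] that sum to 1, with extra[i] below caps[i]."""
    remaining = 1.0 - sum(floors)
    if remaining <= 0 or sum(caps) <= remaining:
        raise ValueError("no array can satisfy these constraints")

    # A random share of each bucket's cap.
    unscaled = [rng.random() * cap for cap in caps]
    total = sum(unscaled)

    if total > remaining:
        # Assumption (not covered in the walkthrough): scale down proportionally.
        extra = [u * remaining / total for u in unscaled]
    else:
        # Distribute the shortfall in proportion to each bucket's headroom.
        headroom = [cap - u for cap, u in zip(caps, unscaled)]
        factor = (remaining - total) / sum(headroom)
        extra = [u + h * factor for u, h in zip(unscaled, headroom)]

    return [floor + e for floor, e in zip(floors, extra)]


# The 7-element case from the question: element 2 > 0.4, element 3 < 0.2, element 7 < 0.1.
floors = [0.0, 0.4, 0.0, 0.0, 0.0, 0.0, 0.0]
caps = [0.6, 0.6, 0.2, 0.6, 0.6, 0.6, 0.1]  # 0.6 for unconstrained elements is an assumption
weights = sample_weights(floors, caps)
print(weights, sum(weights))
```

Passing a seeded `random.Random(...)` instance as `rng` makes the output reproducible. Note that this only guarantees that the constraints hold; it makes no claim that the result is uniformly distributed over all valid arrays.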