I'm looking for an efficient Python function that randomly allocates an integer across k
items.
That is, some function allocate(n, k)
will produce a k-sized
array of integers summing to n
.
For example, allocate(4, 3)
could produce [4, 0, 0]
, [0, 2, 2]
, [1, 2, 1]
, etc.
CodePudding user response:
This should be faster than your brute-force version when n >> k:
def allocate(n, k):
result = np.zeros(k)
sum_so_far = 0
for ind in range(k-1):
draw = np.random.randint(n - sum_so_far 1)
sum_so_far = draw
result[ind] = draw
result[k-1] = n - sum_so_far
return result
The idea is to draw a random number up to some maximum m
(which starts out equal to n
), and then we subtract that number from the maximum for the next draw, and so on, thus guaranteeing that we will never exceed n
. This way we fill up the first k-1 entries; the final one is filled with whatever is missing to get a sum of exactly n
.
Note: I am not sure whether this results in a "fair" random distribution of values or if it is somehow biased towards putting larger values into earlier indices or something like that.
CodePudding user response:
Here's a brute-force approach:
import numpy as np
def allocate(n, k):
res = np.zeros(k)
for i in range(n):
res[np.random.randint(k)] = 1
return res
for i in range(3):
print(allocate(4, 3))
[0. 3. 1.]
[2. 1. 1.]
[2. 0. 2.]