Dear stackoverflow community,
Currently, I am attempting to solve an optimisation problem whereby given an array of numbers (integer or float; non-negative) and a positive integer M, output M number of subsets (of any suitable length) such that the subset with the highest sum among the subsets is minimised. The elements in the subsets can be non-contiguous.
For example, given an array of [1, 4, 5, 3] and an integer M = 2, the desired output is [1, 5] and [4, 3], whereby the highest subset sum is 7 which is minimised.
Another example, given an array of [3, 10, 7, 2] and an integer M = 3, the desired output is [3, 2], [7], and [10] or even [3, 7], [2], and [10] where by the minimised highest subset sum is 10.
Is there anyone who have experienced such an optimisation before? I believe this is a variant of the Kadane algorithm.
Any links, pseudo code, pythonic code, and etc. are very much appreciated.
I have thought the following procedure to solve the problem:
- Sort the array in ascending order
- Initialise M number of empty subsets
- In a while loop, add the smallest and largest availableelement to each subset until no more elements are left to be selected from the parent array
CodePudding user response:
This can be viewed as a variant of an assignment problem:
Let v[i]
be our data array and let j=1..M
indicate the subsets. Introduce binary variables x[i,j] ∈ {0,1}
with:
x[i,j] = 1 if value i is assigned to subset j
0 otherwise
Our optimization model can look like:
min z
z >= sum(i, v[i]*x[i,j]) ∀j "largest sum"
sum(j, x[i,j]) = 1 ∀i "assign each value to a subset"
x[i,j] ∈ {0,1}
This is a linear mixed-integer programming problem and can be solved with any MIP solver.
A result using random data:
---- 34 PARAMETER results
value sub1 sub2 sub3 sub4 sub5 sub6 sub7 sub8 sub9 sub10
i1 17.175 17.175
i2 84.327 84.327
i3 55.038 55.038
i4 30.114 30.114
i5 29.221 29.221
i6 22.405 22.405
i7 34.983 34.983
i8 85.627 85.627
i9 6.711 6.711
i10 50.021 50.021
i11 99.812 99.812
i12 57.873 57.873
i13 99.113 99.113
i14 76.225 76.225
i15 13.069 13.069
i16 63.972 63.972
i17 15.952 15.952
i18 25.008 25.008
i19 66.893 66.893
i20 43.536 43.536
i21 35.970 35.970
i22 35.144 35.144
i23 13.149 13.149
i24 15.010 15.010
i25 58.911 58.911
total 114.181 114.582 114.123 112.961 113.949 113.548 114.478 112.809 110.635 113.993
notes:
- The solution is, in general, not unique.
- It is easy to add a constraint that forbids empty subsets. (Can happen with more skewed data sets).