I need a way of finding an exact value made of the sum of variables chosen from the population. The algorithm can find just the first solution or all. So we can have 10, 20, or 30 different numbers and we will sum some of them to get a desirable number. As an example we have a population of the below numbers: -2,-1,1,2,3,5,8,10 and we try to get 6 - this can be made of 8 and -2, 1 5 etc. I need at least 2 decimal places to consider as well for accuracy and ideally, the sum of variables will be exact to the asking value.
Thanks for any advice and help on this:)
I build a model Using the simplex method in Excel but I need the solution in Python.
CodePudding user response:
This is the subset sum problem, which is an NP Complete problem.
There is a known pseudo-polynomial solution for it, if the numbers are integers. In your case, you need to consider numbers only to 2nd decimal point, so you could convert the problem into integers by multiplying by 1001, and then run the pseudo-polynomial algorithm.
It will works quite nicely and efficiently - if the range of numbers you have is quite small (Complexity is O(n*W)
, where W
is the sum of numbers in absolute value).
Appendix:
Pseudo polynomial time solution is Dynamic Programming adaptation of the following recursive formula:
k is the desired number
n is the total number of elements in list.
// stop clause: Found a sum
D(k, i) = true | for all 0 <= i < n
// Stop clause: failing attempt, cannot find sum in this branch.
D(x, n) = false | x != k
// Recursive step, either take the current element or skip it.
D(x, i) = D(x arr[i], i 1) OR D(x, i 1)
Start from D(0,0)
If this is not the case, and the range of numbers is quite high, you might have to go with brute force solution, of checking all possible subsets. This solution is of course exponential, and processing it is in O(2^n)
.
(1) Consider rounding if needed, but that's a simple preprocessing that doesn't affect the answer.
CodePudding user response:
This sounded like a fun challenge, so I wasted some time coding it up. This is the dumb approach, going at it totally brute-force, summing up every subset of the input list and comparing the result with our target value.
While we're taking the simple try-every-subset approach, which surely could be greatly improved upon by some more sophisticated algorithm, we are at least doing so very efficiently. We do no copying of elements from the input list into other lists. All we do is extract values from the input list and add them up. All of our manipulations are on integers that have been pulled from the input list. Our way of representing our subsets is very efficient. Each subset is fully described by the input list and a single integer. So all we save to represent each winning combo, is a single integer. So this approach will totally suck for speed, but has to be pretty close to the optimum use of memory.
So here's my solution, including comments that hopefully well describe what it is doing:
data = [-2,-1,1,2,3,5,8,10]
result = []
# We use integers viewed as binary to represent each possible combination of the numbers in our input list. If
# our input list consists of N items, then the number 2**N-1 is the integer that consists of N binary digits, all
# set to 1. So if the input list were [4, 24, -1, 7], then this number would be 2**4-1 = 15 = binary 1111. Now,
# if we iterate over every integer between 0 and the number just described that is based on the length of the input
# list, and for each iteration, we map each binary '1' bit to its place in the input list, and we extract and add
# the value at that position to our accumulator. When we're done, we will have computed the totals for every
# subset of our input list of numbers.
# We use what is described above to iterate over all subsets of the input list looking for subsets that total a
# particular target value. If our total matches the target, we add the number that is the combo of bits for that
# iteration to our results list. When we're done, we have bunch of integers, where each integer is a unique set
# of bits that map to our input list indicating which positions in the list we had to add to our sum to get to
# our target number.
def get_combos_for_target(data, target):
result = []
for combo in range(2**len(data)): # loop over every integer in the range of our bit mask
acc = 0
for i in range(len(data)): # iterate over each index (position) in our input list
if 2**i & combo: # if the bit for the current iteration is set in the current combo...
acc = data[i] # then add the number for that position to our accumulator
if acc == target: # if the final result for this combo is the target number...
result.append(combo) # add that combo to the result
return result
def display_combos(combos):
for combo in combos: # for each of our winning combos
acc = ''
for i in range(len(data)): # iterate over eac index (position) in our input list
if 2 ** i & combo: # if the bit for the current iteration is set in the current combo..
if acc:
acc = ', '
acc = str(data[i]) # add the number at that position to our accumulator string
print(acc) # print the combination
combos = get_combos_for_target(data, 6)
display_combos(combos)
Result (for a target of 6):
1, 2, 3
1, 5
-1, 2, 5
-2, 1, 2, 5
-2, 3, 5
-2, -1, 1, 3, 5
-2, 8
-2, -1, 1, 8
This code will become useless for larger input lists. However, I think it could form a starting point for doing something more clever. I'm already pondering a way I might do this all much more efficiently. I'm thinking it would be some sort of graph thingie that would form the core of the approach.