Subset sum problem with known subset size and array being a range-CodePudding

I'm trying to find a fast way to solve the subset sum problem with a few modifications, I know the exact size of the subset I need to get the target number and I also know the input array to be a range from 1 to 2000. My questions is if there is any way to improve upon the base subset sum problem solution to make it even faster when knowing these conditions as the normal solutions are too slow. Basically the only changing part is the wanted target sum.

I would preferably want it to return all the possible subsets of the given size that add up to the target value if its possible without slowing the program down too much. An example code in python or a similar language would be appriciated.

I've tried many of the solutions for the base subset sum problem but they are too slow to execute due to the size of the input array.

CodePudding user response：

Knowing the size of the subset is an incredibly powerful information, because you don't have to iterate through subset size.

Given N your subset size, you could just :

Sum up the N first elements of your input array (first subset of size N)
Iterate by substracting the first element of your subarray, and adding the element next to it, which translate to looking at the next subarray
Return the subarray if the sum equals your target number

This should be O(input array size) in time and O(1) in memory, regardless of the initial array content. There is probably a more optimal solution using the range property of your initial array.

Here is an example in C :

void subsetSum(std::vector<int>() array, int subArraySize, int targetNumber)
{
    int sum = 0;
    for (int i = 0; i < subArraySize;   i) // Initial sum
    {
        sum  = array[i];
    }
    for (int i = subArraySize; i < array.size(),   i)
    {
        sum -= array[subArraySize-i];
        sum  = array[i];
        if (sum == targetNumber)
            std::cout << subArraySize-i; // this print the starting position of the subarray
    }
}

CodePudding user response：

First find the contiguous subarray that solves this, or as close to contiguous as we can get. The center of this is going to be target/width if width is odd, or (target-1)/width, (target 1)/width if width is even.

Having found the center, add the same number of neighbors on both sides until you get to the desired width. The rightmost element of the array will need to be shifted further right in cases where there is no contiguous solution.

Ruby code:

def f(target, width)
  arr = []

  # put in center of array
  if width % 2 == 0
    arr.append target / width
    arr.append target / width   1
  else
    arr.append target/width
  end
  
  # repeatedly prepend next smallest integer
  # and append next largest integer
  while arr.length < width
    arr.unshift(arr[0] - 1)
    arr.append(arr[-1]   1)
  end
  
  # increase the last element of the array to match
  # the target sum. This is only necessary if there is no
  # contiguous solution. Because of integer division, 
  # where we need to adjust it will always be to increase
  # the sum of the array.
  arr[-1]  = target - arr.sum
  
  return arr
end

Example run:
> f(12342, 7)
=> [1760, 1761, 1762, 1763, 1764, 1765, 1767]

Note that this code doesn't do any of the work of confirming that a solution exists in the range (1, 2000), but your code should.

So far so fast, but finding all subsets that solve this will be slow because there are many. You can find them by pushing elements to the left and right. in pairs.

Final answer will be the sum over i of: (number of ways of pushing elements to the left by a cumulative i spaces) (number of ways of pushing elements to the right by a cumulative i spaces.

To give a simple example: for a target of 13, width of 3, we start with [3,4,6].

pushes: arrays
0: [3, 4, 6]
1: [2, 4, 7], [2, 5, 6]
2: [1, 4, 8], [1, 5, 7], [2, 3, 8]
3: [1, 3, 9]
4: [1, 2, 10]

... and we're done. There will be a massive number of these, peaking (I think) when the width of the array is half the width of the range, and the initial array is centered in the range.