Use FFT to find all possible fixed-size subset sums

Time:09-26

I need to solve the following problem: given an integer sequence x of size N, and a subset size k, find all the possible subset sums. A subset sum is the sum of elements in the subset.

If elements of x may appear multiple times (up to k times, of course) in a subset (that is, a sub-multiset), the problem has a pseudo-polynomial-time solution via FFT. Here is an example:

x = [0, 1, 2, 3, 6]
k = 4
xFrequency = [1, 1, 1, 1, 0, 0, 1] # On the support of [0, 1, 2, 3, 4, 5, 6]
sumFrequency = selfConvolve(xFrequency, times = 4) # A fast approach is to simply raise the power of the Fourier series.
sumFrequency > 0 # Gives a boolean vector indicating all possible size-k subset sums.
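A minimal runnable sketch of the example above, assuming the elements of x are non-negative integers. The helper names `convolve` and `self_convolve` are illustrative, and a plain quadratic convolution stands in for the FFT multiply so that the block needs no third-party library. Note that the resulting counts are over ordered k-tuples, which is fine here because only positivity matters.

```python
def convolve(a, b):
    """Plain polynomial product of two coefficient lists.

    An FFT-based multiply could replace this for large supports.
    """
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                out[i + j] += ai * bj
    return out


def self_convolve(freq, times):
    """Convolve freq with itself so that `times` copies are combined."""
    result = [1]  # the polynomial "1": one way to reach sum 0 with no terms
    for _ in range(times):
        result = convolve(result, freq)
    return result


x = [0, 1, 2, 3, 6]
k = 4

# Frequency vector on the support 0..max(x).
x_frequency = [0] * (max(x) + 1)
for v in x:
    x_frequency[v] += 1

sum_frequency = self_convolve(x_frequency, k)

# Sums reachable with exactly k elements when repetition is allowed.
multiset_sums = [s for s, count in enumerate(sum_frequency) if count > 0]
print(multiset_sums)
```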

But what can be done if an element cannot show up multiple times in a subset?

I came up with the following method but am unsure of its correctness. The idea is to first find the frequencies of sums that are produced by adding at least 2 identical elements:

y = [0, 2, 4, 6, 12] # = [0, 1, 2, 3, 6] + [0, 1, 2, 3, 6]
yFrequency = [1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1] # On the support of [0, 1, ..., 12]
sumFrequencyWithRedundancy = convolve(yFrequency, xFrequency, xFrequency)

My reasoning is that since y represents all possible sums of 2 identical elements, every sum in the convolution y * x * x is guaranteed to have been produced by adding at least 2 identical elements. Finally:

sumFrequencyNoRedundancy = sumFrequency - sumFrequencyWithRedundancy
sumFrequencyNoRedundancy > 0

Any mistake or any other established method for solving the problem?

Thanks!

Edits:

After some tests, it does not work. It turns out that many more combinations must be excluded from sumFrequency besides sumFrequencyWithRedundancy, and the combinatorial analysis escalates rapidly with k, eventually making the approach less efficient than brute-force summation.
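The failure can be reproduced on the small example from the question. The sketch below is an illustration only: it uses a plain quadratic convolution in place of FFT, and it compares the subtraction method against a brute-force enumeration with `itertools.combinations`. The names `candidate_sums`, `true_sums` and `false_positives` are made up for this example.

```python
from itertools import combinations


def convolve(a, b):
    """Plain polynomial product; stands in for an FFT-based convolution."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                out[i + j] += ai * bj
    return out


x = [0, 1, 2, 3, 6]
k = 4

x_freq = [0] * (max(x) + 1)
for v in x:
    x_freq[v] += 1

# Sums of two identical elements: 2 * v for each v in x.
y_freq = [0] * (2 * max(x) + 1)
for v in x:
    y_freq[2 * v] += 1

# All ordered 4-tuples, repetition allowed.
sum_freq = [1]
for _ in range(k):
    sum_freq = convolve(sum_freq, x_freq)

# Tuples built as (v + v) + b + c, as proposed in the question.
with_redundancy = convolve(convolve(y_freq, x_freq), x_freq)

candidate_sums = [s for s in range(len(sum_freq))
                  if sum_freq[s] - with_redundancy[s] > 0]

# Ground truth: sums of k distinct positions of x.
true_sums = sorted({sum(c) for c in combinations(x, k)})

false_positives = sorted(set(candidate_sums) - set(true_sums))
print(true_sums)
print(false_positives)
```

For instance, the sum 1 can only be formed as 1 + 0 + 0 + 0, which repeats the element 0, yet the subtraction leaves a positive count for it: the repeated pair can sit in several positions of the ordered tuple, and the correction term does not account for all of them.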

My motivation was to find all possible sample sums given sampling without replacement and the sample size. Then I came across the idea of solving the standard subset sum problem via FFT, where the subset size is free and the qualifying subsets themselves are not needed. The reference materials can be easily found online; it is basically a divide-and-conquer approach:

  1. Divide the superset into 2 sets, left and right.

  2. Compute all possible subset sums in the left and right sets. The sums are represented by 2 boolean vectors.

  3. Convolve the 2 boolean vectors.

  4. Find if the target sum is indicated in the final boolean vector.

You can see why the algorithm works for the standard subset sum problem.
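The four steps can be sketched as follows, assuming non-negative integer elements. This is a rough illustration rather than a reference implementation: `reachable_sums` and `has_subset_sum` are hypothetical names, the split is a simple halving of the list, and the boolean convolution is written out directly where an FFT would be used in practice.

```python
def convolve_bool(a, b):
    """Boolean convolution: out[s] is True if s = i + j with a[i] and b[j]."""
    out = [False] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if bj:
                    out[i + j] = True
    return out


def reachable_sums(xs):
    """Boolean vector v where v[s] is True if some subset of xs sums to s."""
    if not xs:
        return [True]  # only the empty subset, with sum 0
    if len(xs) == 1:
        v = [False] * (xs[0] + 1)
        v[0] = True       # leave the element out
        v[xs[0]] = True   # take the element
        return v
    mid = len(xs) // 2                    # step 1: split into left and right
    left = reachable_sums(xs[:mid])       # step 2: sums of each half
    right = reachable_sums(xs[mid:])
    return convolve_bool(left, right)     # step 3: combine by convolution


def has_subset_sum(xs, target):
    """Step 4: look the target up in the final boolean vector."""
    v = reachable_sums(xs)
    return 0 <= target < len(v) and v[target]


print(has_subset_sum([3, 5, 9], 14))
print(has_subset_sum([3, 5, 9], 13))
```

Each element lives in exactly one half, so no element can be used twice; the subset size, however, is lost in the boolean vectors, which is why this does not directly answer the fixed-size question.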

If anyone can let me know some work on how to find all possible size-k subset sums, I would really appreciate it!

CodePudding user response:

Given k and the n-element array x, it suffices to evaluate the degree-k coefficient in z of the polynomial

$$\prod_{i=1}^{n} \left(1 + y^{x[i]} z\right).$$

This coefficient is a polynomial in y where the exponents with nonzero coefficients indicate the sums that can be formed using exactly k distinct terms.
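As a small worked example (the values x = [1, 2, 4] and k = 2 are chosen here purely for illustration), expanding the product gives:

```latex
(1 + y z)(1 + y^{2} z)(1 + y^{4} z)
  = 1 + (y + y^{2} + y^{4})\, z + (y^{3} + y^{5} + y^{6})\, z^{2} + y^{7} z^{3}
```

The coefficient of z^2 is y^3 + y^5 + y^6, so the sums reachable with exactly two distinct elements are 3, 5 and 6, matching 1 + 2, 1 + 4 and 2 + 4.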

One strategy is to split x with reasonably balanced sums, evaluate each half mod z^(k+1), and then multiply using the school algorithm for the outer multiplications and FFT (or whatever) for the inner. This should end up costing roughly O(k^2 S log^2 S).
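A rough sketch of that strategy, under several simplifying assumptions that are not part of the answer: the elements are non-negative integers, the list is split by length rather than by balanced sums, and the inner multiplication in y is a plain quadratic convolution where an FFT would go. All function names are hypothetical.

```python
def multiply_y(a, b):
    """Product of two polynomials in y (plain convolution, FFT stand-in)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                out[i + j] += ai * bj
    return out


def add_y(a, b):
    """Sum of two polynomials in y given as coefficient lists."""
    if len(a) < len(b):
        a, b = b, a
    out = list(a)
    for i, v in enumerate(b):
        out[i] += v
    return out


def product_mod_z(xs, k):
    """Coefficients of z^0..z^k in prod(1 + y^x * z); each is a list in y."""
    if not xs:
        return [[1]]
    if len(xs) == 1:
        factor = [[1], [0] * xs[0] + [1]]  # 1 + y^x * z
        return factor[:k + 1]
    mid = len(xs) // 2
    left = product_mod_z(xs[:mid], k)
    right = product_mod_z(xs[mid:], k)
    size = min(k, len(left) + len(right) - 2) + 1
    out = [[0] for _ in range(size)]
    # School multiplication in z, dropping every term above z^k.
    for i, a in enumerate(left):
        for j, b in enumerate(right):
            if i + j <= k:
                out[i + j] = add_y(out[i + j], multiply_y(a, b))
    return out


def size_k_subset_sums(xs, k):
    """Sums reachable using exactly k distinct positions of xs."""
    coeffs = product_mod_z(xs, k)
    if k >= len(coeffs):
        return []  # fewer than k elements available
    return [s for s, count in enumerate(coeffs[k]) if count > 0]


print(size_k_subset_sums([0, 1, 2, 3, 6], 4))
```

Keeping integer counts instead of booleans means each coefficient also records how many size-k subsets produce a given sum; clamping to 0/1 after each multiplication would keep the numbers small if only reachability is needed.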

The idea for evaluating elementary symmetric polynomials efficiently is due to Ben-Or.
