I came across an algorithm problem in the Competitive Programmer's Handbook (https://cses.fi/book/book.pdf), in the section on integer partitions and knapsack.
The problem: given an array of integers whose sum is n, output all possible sums that can be formed by a subset of the array.
The dynamic programming solution is understandable. Define dp(x, k) as a boolean function that is true when some subset of the first k numbers sums to x, and false otherwise. Then dp(x, k) = dp(x, k-1) | dp(x - A[k], k-1), and the complexity is O(n^2).
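For example, with A = [1, 2], dp(3, 2) = dp(3, 1) | dp(3 - A[2], 1) = dp(3, 1) | dp(1, 1) = false | true = true, matching the subset {1, 2}.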
However, the book describes a faster solution that exploits the fact that the number of distinct values in the array is O(sqrt(n)) (if there are d distinct positive values, their sum is at least 1 + 2 + ... + d = d(d+1)/2 <= n, so d = O(sqrt(n))), and then sketches a solution that groups equal numbers together. How does it work? The book's description is hard to follow.
CodePudding user response:
I agree that the author should have been more explicit. My reconstruction: given an element a and an (n + 1)-element bit array indicating which sums can be made from the elements considered so far, we can compute in linear time the minimum number of copies of a needed to reach each sum, then keep a sum if that minimum is at most the number of copies of a we actually have. With O(sqrt(n)) distinct values and O(n) work per value, the total is O(n sqrt(n)). Python implementation below.
import collections
import math

def subset_sums(lst):
    n = sum(lst)
    dp = [False] * (n + 1)  # dp[x] is True iff sum x is reachable so far
    dp[0] = True
    # Process each distinct value once, together with its multiplicity.
    for a, count in collections.Counter(lst).items():
        # Keep x reachable iff it needs at most `count` copies of a.
        dp = [min_count <= count for min_count in min_counts(dp, a)]
    return {x for (x, dp_x) in enumerate(dp) if dp_x}

def min_counts(dp, a):
    # Returns, for each sum x, the minimum number of copies of a needed
    # to reach x starting from the sums already marked reachable in dp.
    dp = [(0 if dp_x else math.inf) for dp_x in dp]
    for x in range(a, len(dp)):
        dp[x] = min(dp[x], dp[x - a] + 1)
    return dp
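# A hand-worked illustration (my own example, not from the book):
# suppose sums 0 and 3 are reachable so far and the next element is a = 2.
# min_counts([True, False, False, True], 2) == [0, inf, 1, 0]
# i.e. sum 2 needs one copy of 2, while sum 1 stays unreachable (inf).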
def baseline_subset_sums(lst):
    # Straightforward 0/1 knapsack DP, one pass per array element.
    n = sum(lst)
    dp = [False] * (n + 1)
    dp[0] = True
    for a in lst:
        # Iterate downward so each element is used at most once.
        for x in range(len(dp) - 1, a - 1, -1):
            dp[x] = dp[x] or dp[x - a]
    return {x for (x, dp_x) in enumerate(dp) if dp_x}
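# Complexity note: baseline_subset_sums does O(n) work per array element,
# O(n * len(lst)) overall, while subset_sums does O(n) work per distinct
# value, O(n * sqrt(n)) overall by the sqrt bound above.
# Randomized check that the two implementations agree: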
import random

if __name__ == "__main__":
    for rep in range(100000):
        rand_lst = [random.randrange(20) for i in range(random.randrange(20))]
        assert subset_sums(rand_lst) == baseline_subset_sums(rand_lst), rand_lst
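As a quick example, subset_sums([2, 2, 3]) returns {0, 2, 3, 4, 5, 7}: sum 6 is excluded because it would need three copies of 2, but only two are available.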