Home > Back-end >  How does the O(n sqrt(n)) algorithm work that lists all possible sums given an array of numbers?
How does the O(n sqrt(n)) algorithm work that lists all possible sums given an array of numbers?

Time:10-24

I came across one algorithm problem in the book Competitive programming handbook https://cses.fi/book/book.pdf: Integer partitions, Knapsack.

The problem is: Given an array of integers whose sum is n. Output all possible sums that can be formed by using a subset of integers.

The dynamic programming solution is understandable. Define dp(x,k) as a bool function that is true when we can use the first k numbers to sum up to x, and false otherwise. Then dp(x,k) = dp(x,k-1) | dp(x-A[k],k-1). The complexity is O(n^2).

However, the book described another solution that leverages one fact that the number of distinct numbers in the array is O(sqrt (n)), and then mentioned a solution by grouping similar numbers together. How does it work? The book seems hard to follow.

CodePudding user response:

I agree that the author should have been more explicit. My reconstruction is that, given an element a and an (n 1)-element bit array indicating which sums can be made by subsets of the previously considered elements, we can calculate in linear time the minimum number of copies of a that we need to make a particular sum, then threshold by how many copies of a we actually have. Python implementation below.

import collections
import math


def subset_sums(lst):
    n = sum(lst)
    dp = [False] * (n   1)
    dp[0] = True
    for a, count in collections.Counter(lst).items():
        dp = [min_count <= count for min_count in min_counts(dp, a)]
    return {x for (x, dp_x) in enumerate(dp) if dp_x}


def min_counts(dp, a):
    dp = [(0 if dp_x else math.inf) for dp_x in dp]
    for x in range(a, len(dp)):
        dp[x] = min(dp[x], dp[x - a]   1)
    return dp


def baseline_subset_sums(lst):
    n = sum(lst)
    dp = [False] * (n   1)
    dp[0] = True
    for a in lst:
        for x in range(len(dp) - 1, a - 1, -1):
            dp[x] = dp[x] or dp[x - a]
    return {x for (x, dp_x) in enumerate(dp) if dp_x}


import random

if __name__ == "__main__":
    for rep in range(100000):
        rand_lst = [random.randrange(20) for i in range(random.randrange(20))]
        assert subset_sums(rand_lst) == baseline_subset_sums(rand_lst), rand_lst
  • Related