find nth smallest sum of two squares numbers-CodePudding

It is an algorithm coding: x = a^2 b^2, a,b is positive integers. Find the nth smallest x.

We know that f(1) = 1^2 1^2; f(2) = 1^2 2^2; f(3) = 2^2 2^2....

My idea is update a,b by comparing (a 1)^2 b^2 vs a^2 (b 1)^2. But it is not true, f(4) = 1^2 3^3, which is not update from 2^2 2^2.

Could we find an algorithm better than brute enumeration (O(n^2))?

CodePudding user response：

I don't know what you call brute force, but the problem is probably solvable in time O(n) by exhaustive search.

Consider a square of side 2√n with a corner at the origin, compute all values inside this square, and set a bit in a bit array every time.

This takes a number of operations proportional to (2√n)² = 4n.

For this method to work, one must check if 2√n is large enough to generate the n first Pythagorean sums.

CodePudding user response：

A first algorithm idea would be to use a heap.

Intuitively, you know that the smallest a and b are, the smallest a² b² will be. So:

It's obvious at a glance that 10² 3² < 10² 4²
It's not obvious at a glance whether 10² 3² <? 7² 8²

I suggest adding pairs (a,b) to a min-heap, slowly increasing a and b. Then, the next element in the sequence is the next element in the heap.

In the algorithm below, /2 represents integer division by 2; for instance, 10/2 and 11/2 are both equal to 5.

function nth_sum_of_two_nonzero_squares(n):
    heap <- empty min-heap
    for k in range [2 .. n 1]:
        for a in range [1 .. k/2]:
            push a²   (k-a)² to heap if it's not already in the heap
        next_term <- pop(heap)
    return next_term

Slight improvement: the smallest term added for a given k is for a = k / 2. We can leverage the knowledge of that value to avoid adding more candidates to the heap, as long as the current root of the heap is still smaller than this next candidate.

This gives us:

function nth_sum_of_two_nonzero_squares(n):
    heap <- empty min-heap
    i = 0
    k = 2
    push 1² 1² to heap
    loop forever:
        next_candidate = ((k 1)/2)²   ((k 2)/2)² = ((k 1)² 1)/2
        while next_candidate > peek(heap):
            i <- i   1
            next_term = pop(heap)
            if i == n:
                return next_term
        k <- k   1
        for a in range [1 .. k/2]:
            if a²   (k-a)² not already in heap:
                push a²   (k-a)² to heap
        i <- i 1
        next_term = pop(heap)
        if i == n:
            return next_term

The drawback is that we're adding more and more terms to the heap for every term we pop. Here it looks like the size of the heap is going to be quadratic in k. However, I am convinced that the maximum value of k is in the order of the square root of n. If that is true, then the algorithm uses only linear-time and linear-space.

You might improve on this algorithm by trying to add fewer terms to the heap for every term which is popped. This would require more intuition as to how to sort pairs (a, b) in a pseudo-increasing order of a² b².

Testing in python and comparing with OEIS: Numbers that are the sum of 2 nonzero squares for correctness:

from heapq import heappush, heappop
from itertools import count, islice

def sums_of_two_nonzero_squares():
    heap = [2] # 2 == 1²   1²
    seen = {2}
    for k in count(3):
        next_candidate = (k*k   1)//2
        while heap and heap[0] < next_candidate:
            next_term = heappop(heap)
            yield next_term
            seen.remove(next_term)
        for a in range(1, k//2 1):
            x = a**2 (k-a)**2
            if x not in seen:
                seen.add(x)
                heappush(heap, x)

oeis_groundtruth = [    2, 5, 8, 10, 13, 17, 18, 20, 25, 26, 29, 32, 34, 37, 40, 41, 45, 50, 52, 53, 58, 61, 65, 68, 72, 73, 74, 80, 82, 85, 89, 90, 97, 98, 100, 101, 104, 106, 109, 113, 116, 117, 122, 125, 128, 130, 136, 137, 145, 146, 148, 149, 153, 157, 160, 162, 164, 169, 170, 173, 178]

result = list(islice(sums_of_two_nonzero_squares(), len(oeis_groundtruth)))

print(result)
# [2, 5, 8, 10, 13, 17, 18, 20, 25, 26, 29, 32, 34, 37, 40, 41, 45, 50, 52, 53, 58, 61, 65, 68, 72, 73, 74, 80, 82, 85, 89, 90, 97, 98, 100, 101, 104, 106, 109, 113, 116, 117, 122, 125, 128, 130, 136, 137, 145, 146, 148, 149, 153, 157, 160, 162, 164, 169, 170, 173, 178]

print('result == oeis_groundtruth: ', (result == oeis_groundtruth))
# result == oeis_groundtruth:  True