Sorting an array of integers in nlog(n) time without using comparison operators-CodePudding

Imagine there's have an array of integers but you aren't allowed to access any of the values (so no Arr[i] > Arr[i 1] or whatever). The only way to discern the integers from one another is by using a query() function: this function takes a subset of elements as inputs and returns the number of unique integers in this subset. The goal is to partition the integers into groups based on their values — integers in the same group should have the same value, while integers in different groups have different values. The catch - the code has to be O(nlog(n)), or in other words the query() function can only be called O(nlog(n)) times.

I've spent hours optimizing different algorithms in Python, but all of them have been O(n^2). For reference, here's the code I start out with:

n = 100
querycalls = 0
secretarray = [random.randint(0, n-1) for i in range(n)]
def query(items):
    global querycalls
    querycalls  = 1
    return len(set(items))
groups = []

secretarray generates a giant random list of numbers of length n. querycalls keeps track of how much the function is called. groups are where the results go.

The first thing I did was try to create an algorithm based off of merge sort (split the arrays down and then merge based on the query() value) but I could never get it below O(n^2).

CodePudding user response：

Let's say you have an element x and an array of distinct elements, A = [x0, x1, ..., x_{k-1}] and want to know if the x is equivalent to some element in the array and if yes, to which element.

What you can do is a simple recursion (let's call it check-eq):

Check if query([x, A]) == k 1. If yes, then you know that x is distinct from every element in A.
Otherwise, you know that x is equivalent to some element of A. Let A1 = A[:k/2], A2 = A[k/2 1:]. If query([x, A1]) == len(A1), then you know that x is equivalent to some element in A1, so recurse in A1. Otherwise recurse in A2.

This recursion takes at most O(logk) steps. Now, let our initial array be T = [x0, x1, ..., x_{n-1}]. A will be an array of "representative" of the groups of elements. What you do is first take A = [x0] and x = x1. Now use check-eq to see if x1 is in the same group as x0. If no, then let A = [x0, x1]. Otherwise do nothing. Proceed with x = x2. You can see how it goes.

Complexity is of course O(nlogn), because check-eq is called exactly n-1 times and each call take O(logn) time.