Minimum absolute difference of a set of numbers-CodePudding

I am given a number set of size n, and a list of Q inputs. Each input will either remove or add a number to this set. After each input, I am supposed to output the minimum absolute difference of the set of numbers.

Constraints:

2 <= N <= 10^6

1 <= Q <= 10^6

-10^9 <= set[i] <= 10^9

Example:

input:

set = [2, 4, 7]
ADD 6
REMOVE 7
REMOVE 4
ADD 2

output:

I am tasked to solve this using an algorithm of time complexity O((N Q)log(N Q)) or better.

My current implementation is not fast enough, but it is as follows:

TreeSet<Integer> tree = new TreeSet<>();
HashMap<Integer, Integer> numberFreq = new HashMap<>();

int dupeCount = 0;

for (int i : set) {
    tree.add(i);
    if (numberFreq.get(i) > 0) dupeCount  ;
    numberFreq.put(i, numberFreq.getOrDefault(i, 0)   1);
}

void add(int i) {
    if (numberFreq.get(i) > 0) dupeCount  ;
    numberFreq.put(i, numberFreq.getOrDefault(i, 0)   1);
    tree.add(i); // if duplicate nothing gets added anyway
   
    if (dupeCount > 0) Console.write(0);
    else {
        int smallestBall = ballTree.first();
        int absDiff;
        int maxAbsDiff = Integer.MAX_VALUE;
        while (ballTree.higher(smallestBall) != null) {
            absDiff = Math.abs(smallestBall - ballTree.higher(smallestBall));
            maxAbsDiff = Math.min(absDiff, maxAbsDiff);
            smallestBall = ballTree.higher(smallestBall);
        }
    Console.write(maxAbsDiff);
    }

}

void remove(int i) {
     if (numberFreq.get(i) > 0) dupeCount--;
     else tree.remove(i);
     numberFreq.put(i, numberFreq.get(i) - 1);

     if (dupeCount > 0) Console.write(0);
     else {
         int smallestBall = ballTree.first();
         int absDiff;
         int maxAbsDiff = Integer.MAX_VALUE;
         while (ballTree.higher(smallestBall) != null) {
             absDiff = Math.abs(smallestBall - ballTree.higher(smallestBall));
             maxAbsDiff = Math.min(absDiff, maxAbsDiff);
             smallestBall = ballTree.higher(smallestBall);
        }
     Console.write(maxAbsDiff);
     }
}

I've tried at it for 2 days now and I'm quite lost.

CodePudding user response：

Here's one algorithm that should work (though I don't know if this is the intended algorithm):

Sort the list of numbers L (if not already sorted): L = [2, 4, 7]
Build a corresponding list D of "sorted adjacent absolute differences" (i.e., the differences between adjacent pairs in the sorted list, sorted themselves in ascending order): D = [2, 3]
For each operation... Suppose the operation is ADD 6, as an example. a) insert the number (6) into L in the correct (sorted) location: L = [2, 4, 6, 7]; b) based on where you inserted it, determine the corresponding adjacent absolute difference that is now obsolete and remove it from D (in this case, the difference 7-4=3 is no longer relevant and can be removed, since 4 and 7 are no longer adjacent with 6 separating them): D = [2]; c) Add in the two new adjacent absolute differences to the correct (sorted) locations (in this case, 6-4=2 and 7-6=1): D = [1, 2, 2]; d) print out the first element of D

If you encounter a remove operation in step 3, the logic is similar but slightly different. You'd find and remove the element from L, remove the two adjacent differences from D that have been made irrelevant by the remove operation, add the new relevant adjacent difference to D, and print the first element of D.

The proof of correctness is straightforward. The minimum adjacent absolute difference will definitely also be the minimum absolute difference, because the absolute difference between two non-adjacent numbers will always be greater than or equal to the absolute difference between two adjacent numbers which lie "between them" in sorted order. This algorithm outputs the minimum adjacent absolute difference after each operation.

You have a few options for the sorted list data structures. But since you want to be able to quickly insert, remove, and read ordered data, I'd suggest something like a self-balancing binary tree. Suppose we use an AVL tree.

Step 1 is O(N log(N)). If the input is an array or something, you could just build an AVL tree; insertion in an AVL tree is log(N), and you have to do it N times to build the tree.

Step 2 is O(N log(N)); you just have to iterate over the AVL tree for L in ascending order, computing adjacent differences as you go, and insert each difference into a new AVL tree for D (again, N insertions each with log(N) complexity).

For a single operation, steps 3a), 3b), 3c), and 3d) are all O(log(N Q)), since they each involve inserting, deleting, or reading one or two elements from an AVL tree of size < N Q. So for a single operation, step 3 is O(log(N Q)). Step 3 repeats this across Q operations, giving you O(Q log(N Q)).

So the final algorithmic runtime complexity is O(N log(N)) O(Q log(N Q)), which is less than O((N Q) log(N Q)).

Edit:

I just realized that the "list of numbers" (L) is actually a set (at least, it is according to the question title, but that might be misleading). Sets don't allow for duplicates. But that's fine either way; whenever inserting, just check if it's a duplicate (after determining where to insert it). If it's a duplicate, the whole operation becomes a no-op. This doesn't change the complexity. Though I suppose that's what a TreeSet does anyways.