Why is this code failing a test case [Max Distance]


Problem: Given an array A of integers, find the maximum of j - i subject to the constraint A[i] <= A[j].

If there is no solution possible, return -1.

Example :

A : [3 5 4 2] Output : 2, for the pair of values (A[0] = 3, A[2] = 4)

INPUT: 9 8 7 -9 -1

EXPECTED OUTPUT: 1

MY OUTPUT: 0

The code I tried passes all the cases except the input given above. Can anyone explain why it is failing that case and provide a corrected version?

My code(Python):

class Solution:
    def maximumGap(self, A):
        # write your method here
        m=-1
        for i in range(len(A)-1):
            j=len(A)-i-1
            if(A[i]<=A[j]):
                m=max(m,j-i)
        return m
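For reference, here is a small self-contained trace of the loop above on the failing input, showing which index pairs actually get compared:

```python
A = [9, 8, 7, -9, -1]   # the failing input
m = -1
for i in range(len(A) - 1):
    j = len(A) - i - 1          # only the mirrored index is ever checked
    if A[i] <= A[j]:            # pairs tested: (0,4), (1,3), (2,2), (3,1)
        m = max(m, j - i)
print(m)  # 0 -- the winning pair (i=3, j=4) is never compared
```

Since j is tied to i, the pair (i=3, j=4) with gap 1 is simply never tested, which is why the output is 0 instead of 1.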

I tried using 2 loops, and that passes the above case but gives Time Limit Exceeded on larger inputs.

        m=-1
        for i in range(len(A)):
            for j in range(len(A)): 
                if(A[i]<=A[j]):
                    m=max(m,j-i)
        return m

CodePudding user response:

You don't need to test every pair. For each i, scan from the end of the array until you find an element that's >= A[i]; that j gives the biggest gap for that i. (Incidentally, your single-loop version fails because it only ever compares A[i] with the one mirrored index j = len(A)-i-1, so on 9 8 7 -9 -1 the winning pair (i=3, j=4) is never checked.)

As an additional optimization, you can stop the inner scan once j - i can no longer beat the best gap found so far, and break out of the outer loop entirely once even the farthest remaining j can't improve on it.

m = -1
n = len(A)
for i, el in enumerate(A):
    if n - 1 - i <= m:
        break  # even j = n-1 can't beat the current best
    for j in range(n - 1, i + m, -1):  # only j with j - i > m
        if el <= A[j]:
            m = max(m, j - i)
            break

CodePudding user response:

Very interesting problem you have! It inspired me to implement a very fast, almost-linear-time solution. My algorithm below uses sorting, and its running time is dominated by a single sort of the whole array, so it runs in O(N * Log2(N)) time, where N is the number of elements in the array.

Although my algorithm is more complex than the other solutions, it achieves much faster speeds for bigger N than the quadratic solutions with O(N^2) running time, one of which is given in the OP's question.

My algorithm performs the following steps:

  1. Arg-sort the input array a. In computer science, arg-sort means finding the order of indices that makes the array sorted. In other words, for an array a, arg-sort finds an array of indices sort_idxs such that a[sort_idxs[i]] is sorted over all 0 <= i < len(a). This step is done with the regular sorted() built-in function and a key = lambda... argument.

  2. Find the reverse of the arg-sort indices, i.e. the array of indices sort_idxs_rev such that sort_idxs_rev[sort_idxs[i]] = i for all 0 <= i < len(a). This step is linear in time, i.e. takes O(N) time.

  3. Set begin_k and max_dist both to 0. Iterate through all j in the range 0 <= j < len(a) backwards. While iterating, perform steps 4.-5. for each j. Over the whole loop, steps 4.-5. take linear O(N) time in total.

  4. Find the minimum (denoted i) of sort_idxs[k] over all k in the range begin_k <= k < sort_idxs_rev[j].

  5. Update max_dist, the resulting maximal distance: set it to the new value j - i if that is bigger than the previous max_dist. Update begin_k to hold sort_idxs_rev[j] + 1.

  6. Output resulting max_dist as answer.
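As a small illustrative sketch of steps 1.-2. on the example array (using the same variable names as the full code further below):

```python
a = [3, 5, 4, 2]

# Step 1: arg-sort -- indices that put `a` into sorted order
sort_idxs = sorted(range(len(a)), key=lambda i: (a[i], i))
print(sort_idxs)      # [3, 0, 2, 1] -> a[3]=2, a[0]=3, a[2]=4, a[1]=5

# Step 2: reverse arg-sort -- position of each index within the sorted order
sort_idxs_rev = [0] * len(a)
for pos, idx in enumerate(sort_idxs):
    sort_idxs_rev[idx] = pos
print(sort_idxs_rev)  # [1, 3, 2, 0]

# Sanity check: sort_idxs_rev[sort_idxs[i]] == i for every i
assert all(sort_idxs_rev[sort_idxs[i]] == i for i in range(len(a)))
```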

Explanation of algorithm above:

One can observe that if we take the rightmost value a[j], then for all a[i] <= a[j] we can update the max distance to j - (minimal such i) if that is bigger than the previous max distance (max distance is 0 at the start).

After that we can remove from further computation all array elements a[i] such that a[i] <= a[j], because no smaller j can give a bigger distance for those elements. If some j0 with j0 < j gave a bigger distance, that would mean j - min_i < j0 - min_i, hence j < j0, contradicting j0 < j.

Finding the minimal element among the i indices takes linear O(count_i) time, and since these i are removed from further computation, the following steps take O(N - count_i) time, hence the total time is O(count_i) + O(N - count_i) = O(N).

We can find all elements smaller than or equal to a[j] by using the previously computed arg-sort and reverse arg-sort indices, so finding the smaller elements is linear in time.

So each j removes a batch of elements a[i] smaller than a[j], and also updates the max distance to the biggest possible for that j.

As we iterate all j from right to left, we observe for each j the maximal possible distance for that j. The maximum of these max-distances over all j is the final solution: if a solution exists, it is achieved at some point j = j_sol, and because we observe every j, we also observe j = j_sol and its corresponding max distance.

At each iteration of j we remove a batch of a[i] from further observation, so the array becomes shorter and shorter. Each iteration needs linear O(count_i) time to find the smallest i, where count_i is the number of removed i indices. As each iteration removes count_i elements and takes O(count_i) time to find the minimum, the total running time of the j-loop is O(count_i_0) + ... + O(count_i_N) = O(N), because all the count_i sum to N.

Of course, actually removing the array elements a[i] would be slow, because Python's list is implemented in such a way that removing an element from the middle takes O(N) time. So in my code below, instead of actually removing elements, I just increase begin_k by count_i on each iteration. This emulates removal: removing elements from a sorted array just means keeping a pointer to the beginning of the live range, and everything before that pointer is considered removed. Hence I keep such a begin_k (which grows gradually by count_i), marking the point in the sorted order before which everything counts as removed.
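The pointer trick can be sketched in isolation (a toy illustration, not the full algorithm):

```python
# Emulated removal: advance a pointer over the sorted order instead of
# calling list.remove(), which costs O(N) per call.
sorted_vals = [2, 3, 4, 5]
begin = 0                   # everything before `begin` counts as removed

begin += 2                  # "remove" the two smallest values in O(1)
print(sorted_vals[begin:])  # [4, 5] -- the values still in play
```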

So the arg-sort takes most of the time, and it is still very fast, O(N * Log2(N)), because that is the running time of Python's sort. The reverse arg-sort takes O(N), and the j-loop takes O(N) too. Hence the total running time is dominated by the sorting algorithm.

If the input array were really huge, say billions of elements, my algorithm would beat every quadratic algorithm with O(N^2) running time. Of course, to handle billions of array elements one would have to use C instead of Python; sorting billions of elements is still fast in C and takes on the order of a dozen seconds.

In my code you can change the first line from input_ = '3 5 4 2' to input_ = input() if you want to take input from the console. 3 5 4 2 is used in the code as fixed input just for the sake of a runnable, self-contained example that every visitor of Stack Overflow can run without external dependencies. The final answer is printed to the console.

Full code below:


# Input data
#input_ = '9 8 7 -9 -1'
input_ = '3 5 4 2' # input()
a = list(map(int, input_.split()))

# Arg-sort input array
sort_idxs = sorted(range(len(a)), key = lambda i: (a[i], i))

# Compute reverse of arg-sort indices
sort_idxs_rev = [0] * len(a)
for i0, i1 in enumerate(sort_idxs):
    sort_idxs_rev[i1] = i0
    
begin_k = 0
max_dist = 0

# Linearly search for the answer
for j in range(len(a) - 1, -1, -1):
    end_k = sort_idxs_rev[j]
    if begin_k >= end_k:
        continue
    i = min(sort_idxs[k] for k in range(begin_k, end_k))
    max_dist = max(max_dist, j - i)
    begin_k = end_k + 1
    if begin_k >= len(a):
        break

# Output answer
print(max_dist)

Input:

3 5 4 2

Output:

2