About recursive function in merge sort-CodePudding

I was trying to understand the step by step(all the things that goes under the hood and every little detail) of the merge sort algorithm implemented by recursive function. However, I really don't understand what is going on in the following function, which i found from Geeks for geeks. so as far as i know, recursive function are recursively called until the base case is reached and it returns to the caller. however, in this function, the following questions still are on my mind:

what is the base case here?
after the recursive call reaches the base case( in this case I think mergesort(singledigit), aren't we supposed to return something? I don't really know what kind of question I should ask, as the whole idea of recursion is confusing me. please help.

It would make more sense to explain this recursion by going through and explaining the ff example? for example my current understanding is that, let arr=[4,2,7,9,12,3,1,5] then recursive call would be in ff order merge(arr)->merge([4,2,7,9])->merge([4,2])->merge([4]) but since the len(4)<=1, it doesn't get in the if clause of the ff function, which means it does go past the two lines of mergesort() function calls. then it proceeds with merge(2), which will make it the same. finally, there won't be the sorting. this definitely shows that i am misunderstanding the flow of excution. so i would appreciate if someone could go through an example(like use the array above) to explain the step by step of the excution. thank you in advance.

def mergeSort(arr):
    if len(arr) > 1:
       # print("This is funny to be honest",arr[0],len(arr))
        mid = len(arr)//2
        print("This is funny",arr,arr[0],len(arr))
        L = arr[:mid]
        R = arr[mid:]
        mergeSort(L)
        mergeSort(R)
        i = j = k = 0
        while i < len(L) and j < len(R):
            if L[i] < R[j]:
                arr[k] = L[i]
                i  = 1
            else:
                arr[k] = R[j]
                j  = 1
            k  = 1

        # Checking if any element was left
        while i < len(L):
            arr[k] = L[i]
            i  = 1
            k  = 1

        while j < len(R):
            arr[k] = R[j]
            j  = 1
            k  = 1

CodePudding user response：

Nothing happens other than the recursive function splitting arrays into two sub-arrays and calling itself until two sub-arrays of size 1 are produced. At that point, those sub-arrays are merged to create a sorted run of size 2. Then the sort follows the call chain up and down, depth first, left first, merging runs of ever increasing size, until it gets to the initial call where two sorted runs are merged back into the original array.

Note that top down merge sort is mostly academic. Most libraries implement a stable sort as a hybrid of insertion sort and bottom up merge sort, such as Timsort.

CodePudding user response：

mergeSort(arr) does not return any value. It just returns. But it does change its argument array, arr. See all those arr[k] = ... statements? That's what they are doing, placing new elements into the array's entries, one by one.

Now you want to understand the whole chain of function calls, but that's hard. We use recursion precisely so that we don't have to do that, and instead we can solve our problem easily.

The way to understand a recursive function is first of all to assume that it works correctly.

This means we can use it whenever we need to solve a problem of the kind it expects.

This seems circular. After all def solve(x) : { solve(x) } gets us nowhere.

The trick is to use our function to solve smaller subproblems of our original problem. We already "know" it works correctly. We just need to combine the solved subproblems somehow so that our original problem gets solved. This is the non-trivial step.

Specifically, after splitting the array arr into the two halves, L and R, we know both are smaller in size than the original array arr. Thus we "know" that after calling mergeSort(L), L is now sorted. And similarly for R.

And the non-trivial step is the merge. Assuming L and R are already sorted before the while loop, it is easy to see that arr will be filled by its original elements in the sorted order after the while loop has finished.

And that's that.

Or is it? Well, we have the base case to consider, the array arr which is so small that it can't be split into halves. That's no problem though, as such array is already sorted.

Thus the assumption holds for the smallest size arrays, and it also holds for a bigger arrays because the merge works correctly under the assumption that our function works correctly for the smaller sizes.

Thus is works for our input array, whatever its size is.

In other words it works for any array. Thus it works for all of them.

This is known as "induction": A. we prove that a property holds for a base case; B. we prove that the combination step preserves that property; thus C. that property holds for any bigger case which we solve by combining the solutions found for its smaller subproblems which are similar in nature to the original problem:

    solve(X --> R) :  isBase(X) --> R = X.
    solve(X Y --> R) :  solve(X --> A), 
                          solve(Y --> B),
                            combine( (A,B) --> R ).