Recursion error in sorting algorythm for singly linked lists-CodePudding

I've been trying to write a mergeSort algorithm for singly linked lists in Python and, for bigger lists, in function merge(p1,p2) I keep getting an error about reaching the maximum depth of recursion: "maximum recursion depth exceeded in comparison". I tried to write this function without recursion and it works but I would like to give this thing a try also. I can't find where I made a mistake and if someone would be kind enough to point me in the right direction, I would be pleased.

Node class:

class Node():
    def __init__(self):
        self.data=None
        self.next=None

Merge sort:

def merge(p1,p2): 

    if p2 == None:
        return p1
    if p1 == None:
        return p2
    if p1.data<=p2.dat:
        temp=p1
        temp.next=merge(p1.next,p2)
    else:
        temp=p2
        temp.next=merge(p1,p2.next)
    return temp

def merge_sort(p):

    if p.next == None:
        return p
    mid=get_middle(p)
    h2=mid.next

    mid.next=None

    left=merge_sort(p)
    right=merge_sort(h2)


    output=merge(left,right)

    return output

def get_middle(p):

    if p==None:
        return p
    slow=p
    fast=p.next

    while (fast != None and fast.next != None):
        slow = slow.next
        fast = fast.next.next
    return slow

CodePudding user response：

To check your current recursion limit:

import sys
sys.getrecursionlimit()

You could change the recursion limit, but it is not the way.

Your merge function should not be recursive, as @ggorlen mentioned, it's going linear on stack depth and exceeding 1000. I have tested when the nodes are over 970, an exception will be thrown.

The merge_v2 function can easily sort without recursion, by sliding and comparing 2 heads p1, p2 simultaneously

For example:
p1: 2, 5, 7
p2: 1, 4, 9

head = 1
tail = head
tail reaches 2, 4, 5, 7, 9
return head

Implementation

def assign_smaller(p1, p2):
    if p1.data <= p2.data:
        new_head = p1
        p1 = p1.next
    else:
        new_head = p2
        p2 = p2.next
    
    return new_head, p1, p2

def merge_v2(p1, p2):
    head, p1, p2 = assign_smaller(p1, p2)
    tail = head

    while (p1 is not None and p2 is not None):
        tail.next, p1, p2 = assign_smaller(p1, p2)
        tail = tail.next

    if p1 is not None:
        tail.next = p1

    if p2 is not None:
        tail.next = p2

    return head

The others are kept, except change merge to merge_v2:

class Node():
    def __init__(self):
        self.data=None
        self.next=None

def merge_sort(p):
    if p.next == None:
        return p

    mid=get_middle(p)
    h2=mid.next
    mid.next=None

    left=merge_sort(p)
    right=merge_sort(h2)

    output=merge_v2(left, right)

    return output

def get_middle(p):

    if p==None:
        return p
    slow=p
    fast=p.next

    while (fast != None and fast.next != None):
        slow = slow.next
        fast = fast.next.next
    return slow

Test code:

import random
size = 1000 # adjust here
random_list = random.sample(range(1, 300000), size)

first_node = Node()
current_node = first_node

step = 1
for i in random_list:
    current_node.data = i
    if step == len(random_list):
        current_node.next = None
    else:
        current_node.next = Node()
        current_node = current_node.next
    step =1

print(random_list)
sorted_node = merge_sort(first_node)
sorted_list = []
while True:
    sorted_list.append(sorted_node.data)
    if sorted_node.next is not None:
        sorted_node = sorted_node.next
    else:
        break

print(sorted_list)

CodePudding user response：

This recursive version of merge will deepen the recursion for every single node (temp) that it visits. This means the consumption of the call stack will be of the same order of magnitude as the list itself. This is not a viable solution for bigger lists.

There is no error in your code, it is just that the stack is limited.

As you say you already have an iterative solution, let me tackle this from the viewpoint of tail recursion. Your code is almost a candidate for tail recursion, except that it performs an assignment to temp.next after the recursive call is made. So to have a case of tail recursion, we should make sure the recursive call is the last thing that happens.

Here is how we could make that happen:

def merge(p1, p2, prev):
    if not p1 or not p2:
        prev.next = p1 or p2
    elif p1.data <= p2.data:
        prev.next = p1
        merge(p1.next, p2, p1)
    else:
        prev.next = p2
        merge(p1, p2.next, p2)
    return prev.next

Note that here we provide an extra parameter: it represents the last node that has already been collected in the partially merged list. The initial call to merge in merge_sort should provide a dummy node for this parameter:

    return merge(left, right, Node())

With this change we will still get the recursion error, but now we have a case for tail recursion. So instead of making the recursive call, the function could return to the caller what it should call. That way the current call has finished before the next one is made, and the call stack will not deepen.

That leads to this code:

def merge(p1, p2):
    def mergesub(p1, p2, prev):
        if not p1 or not p2:
            prev.next = p1 or p2
        elif p1.data <= p2.data:
            prev.next = p1
            return p1.next, p2, p1
        else:
            prev.next = p2
            return p1, p2.next, p2

    dummy = Node()
    domore = mergesub(p1, p2, dummy)
    while domore:
        domore = mergesub(*domore)
    return dummy.next

And now the call in merge_sort can be again like it was:

    return merge(left, right)

This time we no longer have recursion. mergesub will return what merge has to call next, and merge will iterate making those calls until the base case is encountered and mergesub will just return None.