I've been trying to write a mergeSort algorithm for singly linked lists in Python and, for bigger lists, in function merge(p1,p2) I keep getting an error about reaching the maximum depth of recursion: "maximum recursion depth exceeded in comparison". I tried to write this function without recursion and it works but I would like to give this thing a try also. I can't find where I made a mistake and if someone would be kind enough to point me in the right direction, I would be pleased.
Node class:
class Node():
def __init__(self):
self.data=None
self.next=None
Merge sort:
def merge(p1,p2):
if p2 == None:
return p1
if p1 == None:
return p2
if p1.data<=p2.dat:
temp=p1
temp.next=merge(p1.next,p2)
else:
temp=p2
temp.next=merge(p1,p2.next)
return temp
def merge_sort(p):
if p.next == None:
return p
mid=get_middle(p)
h2=mid.next
mid.next=None
left=merge_sort(p)
right=merge_sort(h2)
output=merge(left,right)
return output
def get_middle(p):
if p==None:
return p
slow=p
fast=p.next
while (fast != None and fast.next != None):
slow = slow.next
fast = fast.next.next
return slow
CodePudding user response:
To check your current recursion limit:
import sys
sys.getrecursionlimit()
You could change the recursion limit, but it is not the way.
Your merge
function should not be recursive, as @ggorlen mentioned, it's going linear on stack depth and exceeding 1000. I have tested when the nodes are over 970, an exception will be thrown.
The merge_v2
function can easily sort without recursion, by sliding and comparing 2 heads p1, p2 simultaneously
For example:
p1: 2, 5, 7
p2: 1, 4, 9
head = 1
tail = head
tail reaches 2, 4, 5, 7, 9
return head
Implementation
def assign_smaller(p1, p2):
if p1.data <= p2.data:
new_head = p1
p1 = p1.next
else:
new_head = p2
p2 = p2.next
return new_head, p1, p2
def merge_v2(p1, p2):
head, p1, p2 = assign_smaller(p1, p2)
tail = head
while (p1 is not None and p2 is not None):
tail.next, p1, p2 = assign_smaller(p1, p2)
tail = tail.next
if p1 is not None:
tail.next = p1
if p2 is not None:
tail.next = p2
return head
The others are kept, except change merge
to merge_v2
:
class Node():
def __init__(self):
self.data=None
self.next=None
def merge_sort(p):
if p.next == None:
return p
mid=get_middle(p)
h2=mid.next
mid.next=None
left=merge_sort(p)
right=merge_sort(h2)
output=merge_v2(left, right)
return output
def get_middle(p):
if p==None:
return p
slow=p
fast=p.next
while (fast != None and fast.next != None):
slow = slow.next
fast = fast.next.next
return slow
Test code:
import random
size = 1000 # adjust here
random_list = random.sample(range(1, 300000), size)
first_node = Node()
current_node = first_node
step = 1
for i in random_list:
current_node.data = i
if step == len(random_list):
current_node.next = None
else:
current_node.next = Node()
current_node = current_node.next
step =1
print(random_list)
sorted_node = merge_sort(first_node)
sorted_list = []
while True:
sorted_list.append(sorted_node.data)
if sorted_node.next is not None:
sorted_node = sorted_node.next
else:
break
print(sorted_list)
CodePudding user response:
This recursive version of merge
will deepen the recursion for every single node (temp
) that it visits. This means the consumption of the call stack will be of the same order of magnitude as the list itself. This is not a viable solution for bigger lists.
There is no error in your code, it is just that the stack is limited.
As you say you already have an iterative solution, let me tackle this from the viewpoint of tail recursion. Your code is almost a candidate for tail recursion, except that it performs an assignment to temp.next
after the recursive call is made. So to have a case of tail recursion, we should make sure the recursive call is the last thing that happens.
Here is how we could make that happen:
def merge(p1, p2, prev):
if not p1 or not p2:
prev.next = p1 or p2
elif p1.data <= p2.data:
prev.next = p1
merge(p1.next, p2, p1)
else:
prev.next = p2
merge(p1, p2.next, p2)
return prev.next
Note that here we provide an extra parameter: it represents the last node that has already been collected in the partially merged list. The initial call to merge
in merge_sort
should provide a dummy node for this parameter:
return merge(left, right, Node())
With this change we will still get the recursion error, but now we have a case for tail recursion. So instead of making the recursive call, the function could return to the caller what it should call. That way the current call has finished before the next one is made, and the call stack will not deepen.
That leads to this code:
def merge(p1, p2):
def mergesub(p1, p2, prev):
if not p1 or not p2:
prev.next = p1 or p2
elif p1.data <= p2.data:
prev.next = p1
return p1.next, p2, p1
else:
prev.next = p2
return p1, p2.next, p2
dummy = Node()
domore = mergesub(p1, p2, dummy)
while domore:
domore = mergesub(*domore)
return dummy.next
And now the call in merge_sort
can be again like it was:
return merge(left, right)
This time we no longer have recursion. mergesub
will return what merge
has to call next, and merge
will iterate making those calls until the base case is encountered and mergesub
will just return None
.