I am sorry that this question is a little elaborate - but I couldn't find a better way of asking this question. Thanks much for all the help!
Following is the method I wrote for merging two sorted linked lists - (MergeTwoLists(ListNode list1, ListNode list2)
). Following is the entire program:
using System;
using System.Collections.Generic;
using System.Collections;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace LinkedList
{
public class ListNode
{
public int val;
public ListNode next;
public ListNode(int val = 0, ListNode next = null)
{
this.val = val;
this.next = next;
}
}
internal class Program
{
static void Main(string[] args)
{
ListNode list1= new ListNode(1, new ListNode(2, new ListNode(4, null)));
ListNode list2 = new ListNode(1, new ListNode(3, new ListNode(4, null)));
ListNode list3 = MergeTwoLists(list1,list2);
}
public static ListNode MergeTwoLists(ListNode list1, ListNode list2)
{
if (list1 == null)
return list2;
else if (list2 == null)
return list1;
ListNode **l3 = new ListNode(0,null)**;
ListNode **l4 = l3**;
while (list1 != null && list2 != null)//list1 = 1->2->4->null
{ //list2 = 1->3->4->null
if (list1.val <= list2.val)
{
l4.next = list1;
list1 = list1.next;//after this statement, list1 now = 2->4->null.
}
else
{
l4.next = list2;
list2 = list2.next;
}
**l4 = l4.next**;//this should cause a problem in merging two lists, but doesn't.
}
l4.next=list1==null?list2:list1;
return l3.next;
}
}
}
Now, here is my question -
There are three bolded lines in the method - MergeTwoLists
, above (stackoverflow does not allow bolding a code snippet, so the bolded code snippet was surrounded by double asterisx).
The three bolded lines in the method - MergeTwoLists
- are:
- l3 = new ListNode(0,null)
- l4 = l3
- l4 = l4.next
In line 1. above, l3
reference variable is on the stack, and points to the ListNode
object 0->null
, on the heap.
In line 2. above, l4
reference variable is on the stack, and is assigned the l3
reference variable. So both, l4
and l3
point to the same ListNode
object on the heap - 0->null
.
Line 3. above should cause a problem in merging two lists. Because, when we do l4 = l4.next
- l4
is assigned a reference to a new memory location on the heap. And the link between l3
and l4
is broken (meaning, they now refer to different memory locations on the heap - instead of referring to the same memory location on the heap). Then how is it that l3 is able to merge the two linked lists (list1 and list2), perfectly?
- So, let us see, what happens, when we initially enter the
while
loop in the method -MergeTwoLists
.
- Originally, when we enter the
MergeTwoLists
method,list1
reference variable is on the stack, and points to the followingListNode
object on the heap -1->2->4->null
. 2.Similarly, originally, when we enter theMergeTwoLists
method,list2
reference variable is on the stack, and points to the followingListNode
object on the heap -1->3->4->null
.
Now, when we enter the while
loop the first time (in the method MergeTwoLists
),list1
points to 1->2->4->null
and list2
points to 1->3->4->null
. And since list1.val == list2.val
is true
, we go into the if
statement of the while
loop.
In the if
statement, the first line is l4.next = list1
. After executing l4.next=list1
, the l4
reference variable points to 0->1->2->4->null
on the heap. And since the l3
reference variable =
the l4
reference variable, l3
on the stack also points to the same location on heap, as l4
. That is l3
also points to 0->1->2->4->null
on the heap.
The next statement after this, in the if
block is: list1 = list1.next
, and the list1
reference variable now points to the following ListNode
object on the heap - 2->4->null
.
So the status of l4
and l3
, after the end of the if
block, the first time are:
l4
reference variable points to the ListNode
object 0->1->2->4->null
on the heap.
And, l3
reference variable also points to the same ListNode
object 0->1->2->4->null
on the heap.
Next, we skip the else block (since we have gone into the if
block). And execute the statement right below the else block - namely: l4 = l4.next.
The above statement - l4 = l4.next - is exactly where things got confusing for me.
in the above statement - l4 = l4.next
- the reference variable l4
is assigned a reference to a new object on the heap (that is, it is assigned l4.next
).
Since l4
is now assigned a reference to a new object, the link between l3 and l4 is broken.
That is, l4
now points to new ListNode
object on the heap - 1->2->4->null
.
But, l3
points to the original ListNode
object, that was there on the heap - 0->1->2->4->null
.
- Now, let us see what happens, when we go into the while loop, the second time -
When we enter the while loop the second time:
list1
points to the ListNode
object - 2->4->null
on the heap.
And, list2
points to the ListNode
object - 1->3->4->null
on the heap.
When we enter the while loop the second time, list1!=null
= true
&&
list2!=null
= true
, so we enter the while loop again.
This time, list1.val = 2 and list2.val = 1 - so, we won't go into the if
block and will instead go into the else
block.
The first statement in the else
block is - l4.next = list2
. This statement will result in the following:
l4
reference variable will now point to the following ListNode
object on the heap - 1->1->3->4->null
.
The next line after this in the else
block is: list2 = list2.next
. When we execute list2 = list2.next
, the list2
reference variable will point to the following ListNode
object on the heap - 3->4->null
.
Again, just after the else block, we meet the statement - l4 = l4.next
. And the l4
reference variable will now point to the following ListNode
object on the heap - 1->3->4->null
.
My question is: How is l3
further getting effected by changes to l4
? - since the link between l3
and l4
got broken as soon as we assigned a new object reference to l4
(that is, as soon as we did l4 = l4.next
, the first time at the end of the while
loop)?
Note: By link getting broken, I mean l3
and l4
now do not point to the same object on the heap. But now point to different objects on the heap.
But, when I run the above code in visual studio, l3
follows l4
- i.e. l3
creates a merged sorted linked list perfectly.
Actual correct result: When the program runs in visual studio, it returns the l3
reference variable that points to the following ListNode
object on heap - 0->1->1->2->3->4->4->null
.
Expected result, by my logic here: when we return l3
, it should return 0->1->2->4->null
.
To reiterate my logic: it is the following code snippet, that messes up merging of the two linked lists in sorted order - l4 = l4.next
. We meet this code snippet, the first time, when we enter the while
loop the first time (and henceforth).
And before we enter the while
loop the first time, everything is fine - l3
points to 0->null
on the heap. And since l4 = l3
, l4
also points to the same location on the heap (i.e. 0->null
).
When we go inside the while
loop the first time, things are still fine - we enter the if
block in the while
loop and there we hit l4.next = list1
. This makes l4
point to 0->1->2->4->null
(on the heap). And since l4 = l3
- l3
points to the same location as l4
- 0->1->2->4->null
(on the heap).
But, as soon as we reach the line of code l4 = l4.next
the first time - l4
now points to a new object on the heap (1->2->4->null
). And l3
points to the original object on the heap (0->1->2->4->null
). From here on, whatever changes are made to l4
won't be reflected in l3
(since l3
points to a different memory location on heap - and l4
keeps pointing to different memory locations on the heap, from here on).
So, how come l3
in the end shows the merged lists in completely sorted order? - l3
should only be 0->1->2->4->null
, at the end of the method. So how does l3
give the correct sorted result? Any help would be greatly appreciated!
CodePudding user response:
Firstly, as exhaustively pointed out, The Stack Is An Implementation Detail, it's completely irrelevant to most discussions about reference and value types.
It's actually all about reference semantics, which you get by using class
, a opposed to value semantics with struct
. In other words: reference types are always passed by reference, a variable of such an object points to the object rather than being the object.
To come back to your question:
Essentially, the root/head object referred to by l3
always points to the head of the list. l4
represents the current node. Better variable names would have made this easier to understand.
l3
always points to the head object, this object can still access other linked objects using its next
variable, which is also a pointer (because of reference semantics).
l4
is first assigned to this node. When you modify l4.next
, you are also modifying the same object as l3.next
. Then when you assign l4 = l4.next;
you can also access that node via l3.next.next
.
There is no broken link. The statement "That is l3 also points to 0->1->2->4->null on the heap." should be "That is the object pointed to by l3 also points to 0->1->2->4->null." and that object remains, and was modified via l4.
It's the objects themselves which do the pointing, you therefore have double indirection: l3
variable points to a node, which itself points to l4
which points to another node and so on.
So when it loops back again, l4
now points to the second node in the linked list. l4
is modified so that it then points to the next node according to the merge criteria. Then we again assign l4 = l4.next;
so that we can attach the fourth node to the third.