I am currently working through C Programming A Modern Approach by K.N. King, and one of the exercise questions is:
The following function is supposed to insert a new node into its proper place in an ordered list, returning a pointer to the first node in the modified list. Unfortunately, the function doesn't word correctly in all cases. Explain what's wrong with it and show how to fix it. Assume that the node structure is the one defined in Section 17.5.
struct node
{
int value;
struct node next;
};
struct node *insert_into_ordered_list(struct node *list, struct node *new_node)
{
struct node *cur = list, *prev = NULL;
while (cur->value <= new_node->value) {
prev = cur;
cur = cur->next;
}
prev->next = new_node;
new_node->next = cur;
return list;
}
After trying to understand this for a while and struggling, I eventually came across another stackoverflow question where someone posted this as their answer. I currently don't have enough reputation to ask about it on there, so I'm asking here to try to wrap my head around this.
struct node * insert_into_ordered_list( struct node *list, struct node *new_node )
{
struct node **pp = &list;
while ( *pp != NULL && !( new_node->value < ( *pp )->value ) )
{
pp = &( *pp )->next;
}
new_node->next = *pp;
*pp = new_node;
return list;
}
Could someone explain to me how the previous node, the one before the inserted new_node is being updated to point at the inserted new_node? I'm guessing the line *pp = new_node;
has something to do with it, but I can't understand how. Thank you.
CodePudding user response:
As you said, pp
is a pointer to the pointer to the "current node in the position in the list" - this is sometimes called a "handler" in the literature. The name pp
comes from "pointer to pointer".
*
is the "dereference operator", specifically it means "the data at", so *pp
means "the data at memory location pp," which in this case is the actual pointer to the current node.
When you use an assignment operator with a derefernce, you are saying set the data at memory location pp
to new_node
(that data happens to also be a pointer). Remember, new_node
is a pointer to your new node's data. So when you run the line:
new_node->next = *pp;
you are setting the "next" entry in the data at new_node equal to the pointer to the "current" node. Then when you run the line:
*pp = new_node;
you are updating the pointer at location pp to point to the structured data in struct node
format at new_node
.
Sometimes when unwrapping all these pointers and dereferences in C, it helps me to make sure I understand the type of every expression and subexpression.
For example here are various expressions above, and their types, expressed as boolean operations in modern C using the typeof
operator. Note that in a real program, at compile time these will be replaced with the value 1
("truthy" in C) :)
typeof(new_node) == struct node *;
typeof(pp) == struct node **;
typeof(*pp) == struct node *;
typeof(*new_node) = struct node;
Note that since in C, the =
operator causes memory to be copied to cause the left side of the expression to be equal to the right side in any future evaluations (commonly referred to as the lvalue
and the rvalue
of the expression). So in the parlance, After an =
the lvalue
could evaluate differently than before. The rvalue
is used for calculating the new value of the lvalue
, but itself remains unmodified after the assignment operation.
It is important to remember, anytime you use =
you are blowing over any data in the lvalue
. This is often confusing, as =
is the ONLY operator in C that causes "side-effects" (Sometimes ()
is also considered an operator, of course function calls can also cause side-effects, as in this example using a pointer argument to the function and dereferencing it within the function body).
ALL other operators simply evaluate inside expressions (for example, *
, &
,
etc.), but when you use =
it makes changes. For example, any given expression containing the lvalue
or anything dependent on the lvalue
might evaluate to a different value before and after an =
operation. Because functions can have pointer arguments as in this example, function calls can also cause side effects.
You can also, more simply, think of *
as an operator that "removes *
's" from types, and &
as an operator that "adds *
's" to types - as above they are called the dereference and reference operator.
CodePudding user response:
pp
either initially points to the pointer head
or due to the while loop to the data member next
of some node
while ( *pp != NULL && !( new_node->value < ( *pp )->value ) )
{
pp = &( *pp )->next;
}
For example if pp
points to head
. If head
is equal to NULL
or new_node->value < head->value
then these statements
new_node->next = *pp;
*pp = new_node;
in fact are equivalent to
new_node->next = head;
head = new_node
If pp
points to the data member next
of some node then these statements
new_node->next = *pp;
*pp = new_node;
change the value of the of the current data member next
with the address of the new node
*pp = new_node;
and before this the data member next
of the new node is set to the value stored in the data member next
of the current node.
As for this function
struct node *insert_into_ordered_list(struct node *list, struct node *new_node)
{
struct node *cur = list, *prev = NULL;
while (cur->value <= new_node->value) {
prev = cur;
cur = cur->next;
}
prev->next = new_node;
new_node->next = cur;
return list;
}
then there is no check whether the pointer cur
becomes (or initially is) equal to NULL
. And secondly the pointer list
is not changed when the new node is inserted before the node pointed to by list
.
The function could be defined the following way
struct node *insert_into_ordered_list(struct node *list, struct node *new_node)
{
if ( list == NULL || new_node->value < list->value )
{
new_node->next = list;
list = new_node;
}
else
{
struct node *cur = list;
while ( cur->next != NULL && cur->next->value <= new_node->value)
{
cur = cur->next;
}
new_node->next = cur->next;
cur->next = new_node;
}
return list;
}