Home > Mobile >  What is the difference between dict[new_key] = [dict[key], new_value] and x = dict[key] [new_value
What is the difference between dict[new_key] = [dict[key], new_value] and x = dict[key] [new_value

Time:05-21

I was reading the Python Patterns essay on graph implementations, and came across the section that discusses the optimal implementation of finding the shortest path. The article is here.

I couldn't understand the difference between the two assignments to dist[next]; the latter is quadratic in runtime, but the former is linear?

Linear runtime:

# Code by Eryk Kopczyński 
def find_shortest_path(graph, start, end):
    dist = {start: [start]}
    q = deque(start)
    while len(q):
        at = q.popleft()
        for next in graph[at]:
            if next not in dist:
                dist[next] = [dist[at], next]  # this line
                q.append(next)
    return dist.get(end)

Quadratic runtime:

# Code by Eryk Kopczyński 
def find_shortest_path(graph, start, end):
    dist = {start: [start]}
    q = deque(start)
    while len(q):
        at = q.popleft()
        for next in graph[at]:
            if next not in dist:
                dist[next] = dist[at]   [next]  # this line
                q.append(next)
    return dist.get(end)

Can someone please explain what the difference is between these two assignments?

CodePudding user response:

[ ] bracket notation is used for both subscript de-reference and for constructing lists.

You asked about these expressions:

            dist[next] = [dist[at], next]

            dist[next] = dist[at]   [next]

Let's ignore the LHS left hand side of the assignment, as it is simply storing a dict mapping. Rather than dist[next] = ... it could as easily have been x = ... followed by dist[next] = x.

Similarly, in the RHS let's ignore the dist[at]. We easily could have set a temp var beforehand: y = dist[at], and then use y in the expression.


Ok, with square bracket notation out of the way, let's focus on the algorithmic complexity of constructing [y, next] and y [next] expressions, where y is a list.

We can compute the first expression in O(1) constant time. We simply allocate a short list and fill its first two elements with pointers to the y and to the next objects.

The second list expression, y [next], requires scanning each element of y and copying it into a new temporary result, then finally appending a next pointer to that result. This has O(n) linear cost, if y is of length n.

Explanation

Why was that extra work necessary? Well suppose an element of y is subsequently mutated, changed to a new value. The first expression is fine with that, it will reflect the new value, since the programmer asked to store a pointer in the list, a reference to what y was pointing at.

OTOH the second expression won't preserve that relationship, it is looking just at the values on either side of the plus sign. Imagine that we store that "big list" result and then mutate y. There will be no effect on the big list, which is as it should be.

  • Related