Python Erroneously Modifying Values of Another Array

I am working on implementing policy iteration for a Gridworld task in Python.

My idea was to keep two arrays for the Gridworld: one holding the results of the previous iteration and one holding the results of the current iteration. However, once I wrote the code, I noticed that the values in my results were off because the array holding the previous iteration was also being modified.

def policyIteration():
    # Init arrays
    arr1 = [[0 for x in range(5)] for y in range(5)]
    arr2 = [[0 for x in range(5)] for y in range(5)]

    # Set entire array to -1
    for idx1, val1 in enumerate(arr1):
        for idx2, val2 in enumerate(val1):
            arr1[idx1][idx2] = -1

    # Set termination states to 0
    arr1[0][0] = 0; arr1[4][4] = 0

    while ( not checkConverge( arr1, arr2 ) ):
        for i in range(5):
            for j in range(5):
                if ( arr2[i][j] != 0 ): # Don't modify the termination states
                    arr1[i][j] = piFunc( arr2, i, j )

Now this function depends on two other sub-functions: piFunc (which calculates the updated value for the cell on the current iteration) and checkConverge (which just returns whether the values are the same).

My piFunc is a horrific mess, but as far as I can tell, it's logically sound.

def piFunc( arr, idx1, idx2 ):
    if ( idx1 == 0 ):
        vUp = -1
    else:
        vUp = arr[idx1-1][idx2]
    
    if ( idx1 == 4 ):
        vDown = -1
    else:
        vDown = arr[idx1+1][idx2]

    if ( idx2 == 0 ):
        vLeft = -1
    else:
        vLeft = arr[idx1][idx2-1]
    
    if ( idx2 == 4 ):
        vRight = -1 
    else:
        vRight = arr[idx1][idx2+1]

    val = -1 + ( vUp * 0.25 ) + ( vDown * 0.25 ) + ( vLeft * 0.25 ) + ( vRight * 0.25 )
    return val

In all of this, I never once try to assign anything to arr2 except at the very beginning of the while loop, where I make the arrays the same. In fact, arr2 only appears in the code 4 times! But when I check the arrays before and after, I end up with something like this:

Before:

arr1:
   0  1  2  3  4
0  0 -1 -1 -1 -1
1 -1 -1 -1 -1 -1
2 -1 -1 -1 -1 -1
3 -1 -1 -1 -1 -1
4 -1 -1 -1 -1  0

arr2:
   0  1  2  3  4
0  0 -1 -1 -1 -1
1 -1 -1 -1 -1 -1
2 -1 -1 -1 -1 -1
3 -1 -1 -1 -1 -1
4 -1 -1 -1 -1  0

After:

arr1:
          0         1         2         3         4
0  0.000000 -1.750000 -2.187500 -2.296875 -2.324219
1 -1.750000 -2.375000 -2.640625 -2.734375 -2.764648
2 -2.187500 -2.640625 -2.820312 -2.888672 -2.913330
3 -2.296875 -2.734375 -2.888672 -2.944336 -2.714417
4 -2.324219 -2.764648 -2.913330 -2.714417  0.000000

arr2:
          0         1         2         3         4
0  0.000000 -1.750000 -2.187500 -2.296875 -2.324219
1 -1.750000 -2.375000 -2.640625 -2.734375 -2.764648
2 -2.187500 -2.640625 -2.820312 -2.888672 -2.913330
3 -2.296875 -2.734375 -2.888672 -2.944336 -2.714417
4 -2.324219 -2.764648 -2.913330 -2.714417  0.000000

Why are the values in arr2 changing at all?

CodePudding user response:

I believe the comment from @TimRoberts above correctly diagnoses the issue.

When you assign arr2 = arr1 (the step at the start of the while loop that makes the arrays the same), Python does not copy the list. Both names then refer to the same list object in memory, so any update made through one name is visible through the other.
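As a minimal sketch of that aliasing behaviour (the values here are illustrative and not taken from the question), the snippet below binds two names to one list, mutates it through one name, and checks identity with `is`:

```python
# Assignment binds a second name to the same list; it does not copy.
arr1 = [-1, -1, 0]
arr2 = arr1                      # arr2 is an alias of arr1

arr2[0] = 99                     # mutate through one name...
print(arr1)                      # ...and the change is visible through the other
same_before = arr1 is arr2       # True: one object, two names
print(same_before)

# Rebinding a name (instead of mutating the list) breaks the alias.
arr2 = [0, 0, 0]
same_after = arr1 is arr2        # False: arr2 now names a different list
print(same_after)
```

The `is` operator compares object identity, which makes it a quick way to check whether two variables are really two separate lists.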

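To create an independent copy instead of a second reference to the same list, you can slice it:

arr2 = arr1[:]

Note that a slice is a shallow copy: it duplicates only the outer list. That is enough for a flat list, but for a nested list such as the 5x5 grid in the question the inner rows would still be shared, so copy each row as well (or use copy.deepcopy):

arr2 = [row[:] for row in arr1]

A flat-list example of the slice copy: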

arr1 = [1, 2, 3]
arr2 = arr1[:]
arr2.append(4)
print(arr1) # prints [1, 2, 3]
print(arr2) # prints [1, 2, 3, 4]
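One caveat is worth checking for the grid in the question, which is a list of lists rather than a flat list. The sketch below (variable names are illustrative, not from the original post) compares a slice copy, a row-by-row copy, and `copy.deepcopy` after a row of the original is mutated in place:

```python
import copy

grid = [[-1, -1], [-1, 0]]

shallow = grid[:]                    # new outer list, but the same row objects
by_row = [row[:] for row in grid]    # new outer list and new row lists
deep = copy.deepcopy(grid)           # recursively copies every level

grid[0][0] = 5                       # mutate a row of the original in place

print(shallow[0][0])   # the slice copy sees the change: its rows are shared
print(by_row[0][0])    # unaffected: each row was copied
print(deep[0][0])      # unaffected: every level was copied
```

For a grid of numbers, the row-by-row comprehension and `copy.deepcopy` give the same result. `deepcopy` is the more general choice if the nesting ever gets deeper.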


Related Stack Overflow post on this concept: "python list by value not by reference"
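Putting this together, here is a hedged sketch of how the loop from the question could take an independent snapshot on every sweep. The update rule is the one from piFunc in the question. The names pi_func, policy_iteration, SIZE and TERMINALS, the tolerance-based stopping test, and the max_iters cap are assumptions made for illustration, not the original poster's code:

```python
import copy

SIZE = 5
TERMINALS = {(0, 0), (SIZE - 1, SIZE - 1)}   # termination states stay at 0


def pi_func(arr, i, j):
    """Update rule from the question: off-grid neighbours count as -1."""
    v_up = arr[i - 1][j] if i > 0 else -1
    v_down = arr[i + 1][j] if i < SIZE - 1 else -1
    v_left = arr[i][j - 1] if j > 0 else -1
    v_right = arr[i][j + 1] if j < SIZE - 1 else -1
    return -1 + 0.25 * (v_up + v_down + v_left + v_right)


def policy_iteration(tol=1e-6, max_iters=1000):
    # Start with -1 everywhere except the termination states.
    current = [[-1.0] * SIZE for _ in range(SIZE)]
    for i, j in TERMINALS:
        current[i][j] = 0.0

    for _ in range(max_iters):
        # Independent snapshot: writes to `current` cannot leak into it.
        previous = copy.deepcopy(current)
        for i in range(SIZE):
            for j in range(SIZE):
                if (i, j) not in TERMINALS:
                    current[i][j] = pi_func(previous, i, j)

        # Stop once no cell moved by more than `tol` in this sweep.
        biggest_change = max(
            abs(current[i][j] - previous[i][j])
            for i in range(SIZE)
            for j in range(SIZE)
        )
        if biggest_change < tol:
            break
    return current


first_sweep = policy_iteration(max_iters=1)
print(first_sweep[0][1], first_sweep[1][1])
```

With an independent snapshot, every update in a sweep reads only previous-iteration values. After one sweep, cell (1, 1) is -2.0 rather than the -2.375 in the question's output, which came from reading neighbours that had already been overwritten in the same sweep.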