Understanding Scope / Copy in Python Classes where arguments default to numpy vectors

I have a pretty simple python-3 code that is puzzling me.

test.py :

import numpy as np

class PARTICLE:
    def __init__(self, PosV = np.zeros(3), Mass=0):
        self.posV = PosV
        self.mass = Mass

def main():
    pL = []
    for i in range(10):
        p = PARTICLE(Mass=0)    
        pL.append(p)
    pL[0].posV[0] = 10         ### ERROR, modifies pL[1].posV[0] as well
    pL[0].mass    = 42
    print(pL[0].posV[0])
    print(pL[1].posV[0])       ### Unexpected to be = 10, must be same memory
    print(pL[2].posV[0])       ### Unexpected to be = 10, must be same memory

    print(pL[0].mass)
    print(pL[1].mass)
    print(pL[2].mass)

if __name__ == "__main__":
    main()

When I run it :

$ python test.py
10.0
10.0
10.0
42
0
0

It seems that the default posV for each new PARTICLE object points to the same block of memory, because if I change pL[0].posV[0] it ALSO changes pL[1].posV[0]. However, for args that default to scalars (e.g. Mass), changing pL[0].mass does NOT propagate to pL[1].mass.

QUESTION :

Please explain why modifying pL[0].posV[0] ALSO changes pL[1].posV[0]. What is going on here?

I suspect that it has to do with pointers and deep vs shallow copy, but I'm not sure exactly what is going on. Intuitively, I'd expect that creating a new PARTICLE instance would allocate completely new memory, with each new PARTICLE object being independent of the previous ones. Clearly that is not the case.

CodePudding user response:

It seems that when I create a new PARTICLE object, it looks like the default posV for each new particle points to the same block of memory because if I change pL[0].posV[0] it ALSO changes pL[1].posV[0]

In Python, default arguments are evaluated once when the function is defined, not each time the function is called. This means that in your example pL[0].posV, pL[1].posV, etc. all point to the same object (the numpy array returned by np.zeros(3)) as you said. Therefore changes in one are reflected in the other references too.
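To see this directly, here is a minimal sketch. The function shared_default and the class Particle are hypothetical names made up for illustration; they are not part of your code. The first part shows that every call without an argument gets back the very same array. The second part shows one common workaround, assuming you want each instance to own its array: use None as the default and build the array inside __init__.

```python
import numpy as np

# Hypothetical helper: the default array is created once, at definition time.
def shared_default(v=np.zeros(3)):
    return v

# Two calls without arguments return the very same object.
same_object = shared_default() is shared_default()

# Hypothetical variant of PARTICLE that uses None as a sentinel default.
class Particle:
    def __init__(self, pos_v=None, mass=0):
        # A fresh array is built on every call, so instances share nothing.
        if pos_v is None:
            self.pos_v = np.zeros(3)
        else:
            self.pos_v = np.array(pos_v, dtype=float)  # np.array copies by default
        self.mass = mass

particles = [Particle() for _ in range(3)]
particles[0].pos_v[0] = 10  # only affects particles[0]
independent = particles[0].pos_v is not particles[1].pos_v
```

With this variant, particles[1].pos_v[0] stays at 0.0 after the assignment, because each instance got its own array.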

However for args that default to scalars (e.g. Mass), changing pL[0].mass does NOT propagate to pL[1].mass.

The difference is that numpy arrays are mutable objects, and by doing

pL[0].posV[0] = 10

you are mutating (updating the first element of) the numpy array that pL[0].posV points to, while

pL[0].mass = 42

is a completely different operation. It does not modify the object that pL[0].mass referred to; it rebinds the attribute pL[0].mass to a different object (the integer 42). The other particles are untouched, so pL[1].mass still refers to 0. Note that integers are immutable, so you cannot change the object in any way and reflect that change in all references to that object.
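The same distinction can be reproduced without any class. This is only a sketch with made-up variable names (a, b, m, n), contrasting in-place mutation of a shared array with rebinding a name to a new integer.

```python
import numpy as np

# Mutation: a and b are two names for one array.
a = np.zeros(3)
b = a
a[0] = 10                      # changes the array itself
mutation_visible = bool(b[0] == 10.0)

# Rebinding: m and n start out referring to the same int object.
m = 0
n = m
m = 42                         # m now refers to a different object; n is untouched
rebinding_visible = (n == 42)
```

Here mutation_visible ends up True and rebinding_visible ends up False, which matches the posV and mass behaviour in the question.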

I strongly recommend you read these excellent blog posts: