I have a fairly simple Python 3 script that is puzzling me.
test.py:
import numpy as np

class PARTICLE:
    def __init__(self, PosV = np.zeros(3), Mass=0):
        self.posV = PosV
        self.mass = Mass

def main():
    pL = []
    for i in range(10):
        p = PARTICLE(Mass=0)
        pL.append(p)
    pL[0].posV[0] = 10  ### ERROR, modifies pL[1].posV[0] as well
    pL[0].mass = 42
    print(pL[0].posV[0])
    print(pL[1].posV[0])  ### Unexpected to be = 10, must be same memory
    print(pL[2].posV[0])  ### Unexpected to be = 10, must be same memory
    print(pL[0].mass)
    print(pL[1].mass)
    print(pL[2].mass)

if __name__ == "__main__":
    main()
When I run it:
$ python test.py
10.0
10.0
10.0
42
0
0
It seems that when I create a new PARTICLE object, the default posV for each new particle points to the same block of memory, because if I change pL[0].posV[0] it ALSO changes pL[1].posV[0]. However, for arguments that default to scalars (e.g. Mass), changing pL[0].mass does NOT propagate to pL[1].mass.
QUESTION:
- Please explain why modifying pL[0].posV[0] ALSO changes pL[1].posV[0]. What is going on here?
I suspect that it has to do with pointers and deep vs. shallow copies, but I'm not sure exactly what is going on. Intuitively, I'd expect that creating a new PARTICLE instance allocates completely new memory, with each new PARTICLE object being independent of the previous ones. Clearly that is not the case.
CodePudding user response:
It seems that when I create a new PARTICLE object, it looks like the default posV for each new particle points to the same block of memory because if I change pL[0].posV[0] it ALSO changes pL[1].posV[0]
In Python, default arguments are evaluated once, when the function is defined, not each time the function is called. This means that in your example pL[0].posV, pL[1].posV, etc. all point to the same object (the numpy array returned by np.zeros(3)), as you said. Therefore a change made through one reference is visible through all the others too.
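The usual way around this is to default to None and build the object inside the function. Here is a minimal sketch of both behaviours. The class names Particle and SafeParticle are made up for illustration, and a plain list stands in for np.zeros(3) so the snippet has no dependencies; a numpy array used as a default behaves the same way.

```python
class Particle:
    # The default list is created once, when this def statement runs,
    # and the same object is handed to every call that omits pos.
    def __init__(self, pos=[0.0, 0.0, 0.0], mass=0):
        self.pos = pos
        self.mass = mass


class SafeParticle:
    # None is only a sentinel; a fresh list is built on every call.
    def __init__(self, pos=None, mass=0):
        self.pos = [0.0, 0.0, 0.0] if pos is None else pos
        self.mass = mass


a, b = Particle(), Particle()
shared = a.pos is b.pos        # both instances hold the very same list
a.pos[0] = 10.0
leaked = b.pos[0]              # the change shows up through b as well

c, d = SafeParticle(), SafeParticle()
independent = c.pos is not d.pos
c.pos[0] = 10.0
untouched = d.pos[0]           # d has its own list, so nothing changes

print(shared, leaked)          # True 10.0
print(independent, untouched)  # True 0.0
```

In your class the equivalent would be PosV=None in the signature, with np.zeros(3) created inside __init__ when PosV is None.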
However for args that default to scalars (e.g. Mass), changing pL[0].mass does NOT propagate to pL[1].mass.
The difference is that numpy arrays are mutable objects. By doing
    pL[0].posV[0] = 10
you are mutating (updating the first element of) the numpy array that pL[0].posV points to, while
    pL[0].mass = 42
is a completely different operation. It does not touch the old object at all: it takes a different object (the integer 42) and binds the attribute pL[0].mass to it, so pL[0].mass now refers to a different object than pL[1].mass does. Note that integers are immutable, so there is no way to change the object itself and have that change show up through every reference to it.
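The same distinction can be seen without any classes. This is only an illustrative sketch with made-up variable names, again using a plain list in place of the numpy array:

```python
shared = [0.0, 0.0, 0.0]
x = shared
y = shared               # x and y are two names for one list

x[0] = 10.0              # mutation: changes the one object both names refer to
after_mutation = y[0]

m = 0
n = m                    # m and n refer to the same integer object
m = 42                   # rebinding: m now names a different object
after_rebinding = n      # n still refers to the original 0

x = [1.0, 2.0, 3.0]      # rebinding a name to a new list also leaves y alone
print(after_mutation, after_rebinding, y)  # 10.0 0 [10.0, 0.0, 0.0]
```

So what matters is not scalar vs. array as such, but whether the statement changes an existing object in place or makes a name refer to another object.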
I strongly recommend you read these excellent blog posts: