Why does [defaultdict(int)] * 3 return three references to the same object

Time:10-14

EDIT:

Why does [defaultdict(int)] * 3 return three references to the same object?

Original Title

Unpack list of defaultdicts into variables has unexpected behavior in Python

Unpacking a list of defaultdict objects into variables does not work the way I would expect. Does anyone know why it behaves this way (see the code snippets below)? I'm using Python 3.9.1.

from collections import defaultdict

# Equivalent behavior - works OK
a, b, c = [int(), int(), int()]
d, e, f = [int()] * 3

# Expected equivalent behavior - BEHAVES DIFFERENTLY
l, m, p = [defaultdict(int), defaultdict(int), defaultdict(int)]
q, r, s = [defaultdict(int)] * 3

Full snippet:

>>> a,b,c = [int(), int(), int()]
>>> a = 4; b = 2; c = 7
>>> a,b,c
(4, 2, 7)
>>> d,e,f = [int()] * 3
>>> d = 11; e = 8; f = 41
>>> d,e,f
(11, 8, 41)

>>> from collections import defaultdict
>>> l,m,p = [defaultdict(int), defaultdict(int), defaultdict(int)]
>>> l['a'] = 1; m['b'] = 2; m['c'] = 3
>>> l,m,p
(
  defaultdict(<class 'int'>, {'a': 1}), 
  defaultdict(<class 'int'>, {'b': 2, 'c': 3}), 
  defaultdict(<class 'int'>, {})
)
>>> q,r,s = [defaultdict(int)] * 3
>>> q['a'] = 111; r['b'] = 222; m['c'] = 333
>>> q,r,s
(
  defaultdict(<class 'int'>, {'a': 111, 'b': 222}), 
  defaultdict(<class 'int'>, {'a': 111, 'b': 222}), 
  defaultdict(<class 'int'>, {'a': 111, 'b': 222})
)

This question is based on the topic posed by the question "Unpack list to variables".

Answer:

The issue is object identity: in the second case, all three names refer to the same object in memory. A simple console test shows this:

>>> from collections import defaultdict
>>> l, m, p = [defaultdict(int), defaultdict(int), defaultdict(int)]
>>> id(l) == id(p)
False
>>> id(m) == id(p)
False

Now let's try the other way:

>>> l, m, p = [defaultdict(int)] * 3
>>> id(l) == id(p)
True
>>> id(m) == id(p)
True

In the first case, the list literal calls defaultdict(int) three times, so you get three separate objects. In the second, defaultdict(int) is called once and the * 3 repeats the reference to that single object three times; list multiplication never copies the element. All three names therefore point to the same dictionary, and an update made through one of them is visible through all of them.
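As a minimal sketch of the usual workaround, assuming the goal is three independent dictionaries, the list can be built with a comprehension so that defaultdict(int) is called once per element. The names x, y and z are illustrative only.

```python
from collections import defaultdict

# List multiplication: defaultdict(int) runs once and the reference is repeated.
q, r, s = [defaultdict(int)] * 3
q['a'] = 111
print(q is r is s)   # True
print(dict(s))       # {'a': 111}

# Comprehension: defaultdict(int) runs once per iteration,
# so every element is a distinct object.
x, y, z = [defaultdict(int) for _ in range(3)]
x['a'] = 1
print(x is y)                      # False
print(dict(x), dict(y), dict(z))   # {'a': 1} {} {}
```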

This answer goes into some more detail on why this seems to happen with certain datatypes but not others. TL;DR: the integers in [int()] * 3 are also three references to one object, which is why id() or is checks on d, e and f initially report the same object. The difference is that ints are immutable, and a statement such as d = 11 rebinds the name d to a new object instead of modifying the shared one, so the variables appear to behave independently. q['a'] = 111, by contrast, mutates the one shared defaultdict in place.
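A short sketch of why the integer case only appears to work: plain assignment rebinds a name, while item assignment mutates the object the name refers to. The helper variable same_before is an assumption added for illustration.

```python
from collections import defaultdict

# All three names start out bound to the same int object (0).
d, e, f = [int()] * 3
same_before = d is e   # True: one object, three references

# Assignment rebinds d to a different object; e and f are untouched.
d = 11
print(d, e, f)         # 11 0 0

# Item assignment mutates the single shared defaultdict in place,
# so the change is visible through every name bound to it.
q, r, s = [defaultdict(int)] * 3
q['a'] = 111
print(r['a'], s['a'])  # 111 111
```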
