Id of list versus str slices

Time:09-22

Given a list a = [1,2,3], why is the following statement true

id(a[:]) == id(a[:]) # true

while the following one is false?

b = a[:]
id(b) == id(a[:]) # false

Also, if I instead use a string (in place of a list), then both statements are true. Why? What am I missing?

CodePudding user response:

The id is defined to be unique for an object across its lifetime. That is, two separate objects existing at the same time cannot have the same id. However, two separate objects existing at different times, as well as objects that are not required to be separate, may have the same id.

id(object)

Return the “identity” of an object. This is an integer which is guaranteed to be unique and constant for this object during its lifetime. Two objects with non-overlapping lifetimes may have the same id() value.


Thus, one has to be mindful of two things when reasoning about id: When must the lifetime of objects overlap, and when must two objects be separate.

When the objects whose id we look at are created only for the id:

>>> #  /  / first a[:]
>>> #  v  v       v    v second a[:]
>>> id(a[:]) == id(a[:])
True

then the objects are not required to exist at the same time. Each id(a[:]) expression can create the slice, get its id and then discard the slice before the equality between the ids is ever checked. This means both slices can have the same id, as they never exist at the same time.

In contrast, when a slice is assigned to a variable it has to exist at least as long as the variable. Thus, when we check the id of an object via a variable

>>> b = a[:]           # < ------------- first a[:]
>>> id(b) == id(a[:])  # < second a[:]     |
False
>>> b                  # < ----------------/

its lifetime overlaps with that of the temporary slice. This means both slices must not have the same id, as they exist at the same time…


iff slicing must create separate objects.
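
The lifetime argument can be checked with a minimal sketch. The names first and second are only illustrative, and the result for the two temporaries depends on the implementation, so it is printed without a claimed value.

```python
a = [1, 2, 3]

# Both slices are bound to names, so their lifetimes overlap.
first = a[:]
second = a[:]
ids_differ_while_alive = id(first) != id(second)
print(ids_differ_while_alive)  # True: two live list objects never share an id

# Temporaries: each slice may be discarded as soon as id() returns,
# so the two ids are allowed to coincide. Whether they actually do
# is an implementation detail (CPython often reuses the freed memory).
print(id(a[:]) == id(a[:]))
```

Only the first result is guaranteed by the definition of id; the second one merely is not forbidden.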

When comparing the behaviour of list and str, the key difference is that the latter does not have behaviour depending on its identity – roughly, this corresponds to mutable and immutable types.

When working with lists, identity is important because we can mutate a specific object. Even if two objects have the same initial value, mutation has a different effect:

>>> a, b = [1, 2, 3], [1, 2, 3]
>>> c = a         # a, b, c have same value
>>> c += [4]      # changing c has different effect on a and b
>>> a == b
False

When working with str, identity is irrelevant because we cannot mutate a specific object. If two objects have the same initial value, immutability guarantees they will always have the same value:

>>> a, b = "123", "123"
>>> c = a         # a, b, c have same value
>>> c += "4"       # changing c has *no* effect on a and b
>>> a == b
True

As a result, slicing a mutable list to a new list must always create a new object. Otherwise, mutating the slice would have unreliable behaviour.
In contrast, slicing an immutable str to a new str may create a new object. Even if it always provides the same object the behaviour is the same.
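
As a rough illustration of that difference, the sketch below compares a full slice of a list with a full slice of a str. The variable names are made up for the example. Whether the str slice is the very same object is left to the implementation, so that line makes no claim about its output.

```python
a = [1, 2, 3]
s = "123"

list_copy = a[:]   # must be a separate object, because lists are mutable
str_slice = s[:]   # may or may not be a separate object

print(list_copy == a, list_copy is a)  # True False
print(str_slice == s)                  # True
print(str_slice is s)                  # implementation-dependent

# Mutating the copy must not affect the original list.
list_copy.append(4)
print(a)  # [1, 2, 3]
```

Since a str cannot be mutated, no program can tell the two possible outcomes of the `is` check apart except by looking at identity itself.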


As a result of how id is defined, with respect to lifetime and separation, a Python implementation must use separate ids in specific cases but may use separate ids in all other cases.

Specifically, a Python implementation is free to re-use ids if objects don't exist at the same time and is free to share ids if behaviour does not depend on identity.

CodePudding user response:

CPython implementation detail: This is the address of the object in memory.

Memory has limited room, and you can think of it as a warehouse with shelves. You label every shelf with a number so you can find its exact position later.

Everything you create must be put somewhere. When you create a variable a, it is an object with an address attached to it. When you create a value [1, 2, 3], it is another object, also with an address attached to it. Then you say
a = 50; you assign 50 to a variable named a.

Under the hood, this means a now refers to the address where 50 lives (it is a pointer).

A variable is an object. A pile of data is an object. Actually, a computer itself doesn't need a variable called a. It already has the address of [1, 2, 3], it knows where it is in memory. The reason we need a variable called a is that we human beings need a name to represent this pile of data instead of using an address.

The example in C:

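#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int a;
    /* &a is where the variable itself lives; it stays fixed. */
    printf("The address of a is %p\n\n", (void *)&a);

    a = 55;
    printf("The address of a is %p\n", (void *)&a);
    /* Treating the value 55 as a pointer simply shows 55 in hex. */
    printf("The address of 55's pointer is %p\n\n", (void *)(uintptr_t)55);

    a = 30;
    printf("The address of a is %p\n", (void *)&a);
    printf("The address of 30's pointer is %p\n\n", (void *)(uintptr_t)30);
    return 0;
}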
// The address of a is 0x7fffb44b83dc

// The address of a is 0x7fffb44b83dc
// The address of 55's pointer is 0x37

// The address of a is 0x7fffb44b83dc
// The address of 30's pointer is 0x1e

you can check this for further reading

Back to the question: once a = [1, 2, 3] exists, every slice we create from it, such as

a[:2], a[:], a[::-1], etc., is a brand new list that occupies its own memory and has its own address. (Indexing with a[0] or a[1] is different: it returns the existing element object rather than a copy.)

The address of a itself won't change when you assign another value to it; the assignment only makes it point to another value's address.
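
In Python terms, id(a) reports the identity of the object that the name a currently refers to, not of the name itself. A minimal sketch of rebinding, where alias and original_id are names introduced only for this example:

```python
a = [1, 2, 3]
alias = a              # a second name for the same list object
original_id = id(a)

a = [4, 5]             # rebinding: the name a now refers to a different object

print(id(alias) == original_id)  # True: the original list is still alive, same id
print(id(a) == original_id)      # False: a refers to another live object
print(alias)                     # [1, 2, 3]
```

Because alias keeps the first list alive, the new list cannot receive the same id.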

CodePudding user response:

just print the ids

a = [1,2,3]
print(id(a), id(a[:])) # "a" has id#001 and its temporary copy id#002; the copy dies right after this line

b = a[:] # a new copy that "b" keeps alive; it may reuse the freed id#002
print(id(b), id(a[:])) # "b" keeps its id, so the new temporary copy of "a" must get a different one, say id#003