Python: what is the difference between adding an object to a set directly and adding its id()?

Time:10-04

Assume I have a custom class CustomObject for which I define neither __hash__ nor __eq__. Is there any condition under which the following two operations produce different output?

a = CustomObject(1)
b = CustomObject(1)
setA = set()

# option 1: store the object itself
setA.add(a)
print(b in setA)

# option 2: store the object's id()
setA.add(id(a))
print(id(b) in setA)

According to the question "What is the default __hash__ in python?", the default __hash__ is derived from the object's id, so I assume there is no difference between the two options above?

If I define a custom __hash__ (and __eq__) for CustomObject, as in the question "add object into python's set collection and determine by object's attribute", the two options above will behave differently, right?
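That second case can be checked with a minimal sketch. It assumes a hypothetical CustomObject that compares and hashes by its value attribute; the names by_object and by_id are illustrative only.

```python
class CustomObject:
    """Hypothetical class that compares and hashes by its value attribute."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, CustomObject):
            return NotImplemented
        return self.value == other.value

    def __hash__(self):
        return hash(self.value)


a = CustomObject(1)
b = CustomObject(1)

by_object = {a}      # option 1: the set holds the object
by_id = {id(a)}      # option 2: the set holds only an integer

# Lookup by object uses __hash__ and __eq__, so an equal object is found.
found_by_object = b in by_object

# Lookup by id compares identities; a and b are both alive, so their ids differ.
found_by_id = id(b) in by_id

print(found_by_object, found_by_id)
```

With both objects alive at the same time, the first lookup succeeds and the second does not, so the two options do diverge once equality is value-based.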

CodePudding user response:

Saving the ID can produce a false positive: the set does not keep the object alive, so if the object is garbage-collected its ID can be reassigned to a new object.

a = CustomObject(1)
setA = set()
setA.add(id(a))
del a
b = CustomObject(1)
print(id(b) in setA)

This prints True if b gets the same ID that a previously had. That can happen in CPython, where id() is the object's memory address and freed memory may be reused.
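The underlying difference is object lifetime, and it can be observed without relying on ID reuse. The sketch below uses weakref to check whether an object is still alive after del. It assumes CPython's reference counting, which frees an object as soon as its last reference goes away; other implementations may free it later.

```python
import weakref


class CustomObject:
    def __init__(self, value):
        self.value = value


# Option 1: the set holds a strong reference to the object.
a = CustomObject(1)
ref_a = weakref.ref(a)
set_of_objects = {a}
del a
kept_alive = ref_a() is not None  # still reachable through the set

# Option 2: the set holds only an integer, not the object.
c = CustomObject(1)
ref_c = weakref.ref(c)
set_of_ids = {id(c)}
del c
freed = ref_c() is None  # nothing references the object any more (CPython)

print(kept_alive, freed)
```

Because the object stored in option 1 stays alive, its ID cannot be handed to another object while it is in the set. In option 2 the integer outlives the object it once described.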

CodePudding user response:

This follows from the reason @Barmar mentioned. An easier way to reproduce it is to create temporary CustomObject instances repeatedly: each one is destroyed before the next is created, so they can all end up with the same address:

>>> class CustomObject:
...     def __init__(self, value):
...         self.value = value
...
>>> {id(CustomObject(1)) for _ in range(10)}
{1799037490496}
>>> {id(CustomObject(i)) for i in range(10)}
{1799034371856}

In addition, iterating over the set gives you only the addresses, not the objects you added. The ctypes library offers ways to get an object back from its address, but doing so is unsafe once the object has been destroyed.
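A small sketch of the iteration difference. The by_id dictionary is not from the answer; it is one assumed alternative to ctypes that maps each id back to its object and, as a side effect, keeps the objects alive.

```python
class CustomObject:
    def __init__(self, value):
        self.value = value


objects = [CustomObject(i) for i in range(3)]

set_of_objects = set(objects)
set_of_ids = {id(obj) for obj in objects}

# Iterating a set of objects yields the objects, so attributes are available.
values_from_objects = sorted(obj.value for obj in set_of_objects)

# Iterating a set of ids yields plain integers.
types_from_ids = {type(item) for item in set_of_ids}

# Assumed alternative to ctypes: an explicit id -> object mapping.
by_id = {id(obj): obj for obj in objects}
recovered = sorted(by_id[i].value for i in set_of_ids)

print(values_from_objects, types_from_ids, recovered)
```

If the objects are needed later, storing them (or a mapping like this) avoids having to reconstruct them from raw addresses.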
