Suppose that tuppy
and guppy
are:
- both instantiated from the
tuple
class - the only elements of
tuppy
andguppy
are integers such as0
,1
,2
, etc...
Is it the case that hash(tuppy) == hash(guppy)
if and only if tuppy == guppy
?
Note that the following all amount to nearly the same thing:
number = hash(t)
number = t.__hash__()
number = tuple.__hash__(t)
Will hash(x) == hash(y)
always return True
for examples like the following?
x = tuple([0, 1])
y = tuple(range(0, 2))
print("x == ".ljust(20), type(x), repr(x))
print("y == ".ljust(20), type(y), repr(y))
print("hash(x) == hash(y)".ljust(20), hash(x) == hash(y))
I have two concerns:
two different tuples hashing to the same value. For example
(1234, 5)
and(1, 2345)
might, theoretically, both hash to the same value.hash(copy.copy(x))
might be different thanhash(x)
.Do tuples have the behavior that
hash(x) == hash(y)
if and only ifx == y
, provided that:- the instances of
tuple
come directly fromtuple
and are not instantiated from a sub-class oftuple
? - the elements of the
tuple
are integers?
- the instances of
The above assumes that we are working with tuples of integers.
What about nested trees of tuples?
import copy
t1a = (1, (2, (3, (4, 5), 6)))
t1b = tuple(eval("[1, (2, (3, (4, 5), 6))]"))
t2a = ((((1, 2), 3), 4), 5, 6)
t2b = copy.deepcopy(t2a)
print("hash(t1a) == hash(t1b)", hash(t1a) == hash(t1b))
print("hash(t1a) == hash(t2a)", hash(t1a) == hash(t2a))
print("hash(t2a) == hash(t2b)", hash(t2a) == hash(t2b))
I know the answer for a smattering of 3 or 4 test-cases, but what about in general?
If you have huge tuples (say, 1 gigabyte each) will the hash values (int
) get truncated? I could imagine that the concatenation of x
and y
has the same hash value as x
if x
is sufficiently long.
Suppose we had a million different tuples in memory. We might begin to have many different tuples associated with the same hash value.
CodePudding user response:
As usual for well-formed hash functions, a == b
implies hash(a) == hash(b)
, but hash(a) == hash(b)
does not necessarily imply that a == b
.
CodePudding user response:
Regardless of type, a __hash__()
implementation is playing by the rules if x == y
implies hash(x) == hash(y)
. There is no reliable implication in the other direction: from hash(x) == hash(y)
, nothing follows about how x
compares to y
.
This is all true even for simple types like ints. There are an unbounded number of possible Python ints, but only a finite number of possible hash codes. Therefore there must exist distinct ints i
and j
such that hash(i) == hash(j)
.