I want to replace string keys in dictionaries in my code with a dataclass
so that I can provide meta data to the keys for debugging. However, I still want to be able to use a string to lookup dictionaries. I tried implementing a data-class with a replaced __hash__
function, however my code is not working as expected:
from dataclasses import dataclass
@dataclass(eq=True, frozen=True)
class Key:
name: str
def __hash__(self):
return hash(self.name)
k = "foo"
foo = Key(name=k)
d = {}
d[foo] = 1
print(d[k]) # Key Error
The two hash functions are the same:
print(hash(k) == hash(foo)) # True
So I don't understand why this doesn't work.
CodePudding user response:
Two objects having different hashes guarantees that they're different, but two objects having the same hash doesn't in itself guarantee that they're the same (because hash collisions exist). If you want the Key
to be considered equal to a corresponding str
, implement that in __eq__
:
def __eq__(self, other):
if isinstance(other, Key):
return self.name == other.name
if isinstance(other, str):
return self.name == other
return False
This fixes the KeyError
you're encountering.
CodePudding user response:
Adding my notes here from the comments on the answer above, as no one looks at those in any case, so those are likely to get swept under the rug at some point.
PyCharm also produces a helpful warning:
'eq' is ignored if the class already defines '
__eq__
' method.I think this means to remove the
eq=True
usage as well, from the@dataclass(...)
decorator.technically, you could also remove the last
if isinstance(..., str):
as well as the lastreturn
statement. I'm not entirely sure what would be the implications of that, however.
Here then, is a slightly more optimized approach (timings with timeit
module below):
class Key:
name: str
def __hash__(self):
return hash(self.name)
def __eq__(self, other):
return self.name == getattr(other, 'name', other)
Timings with timeit
from dataclasses import dataclass
from timeit import timeit
@dataclass(frozen=True)
class Key:
name: str
def __hash__(self):
return hash(self.name)
def __eq__(self, other):
if isinstance(other, Key):
return self.name == other.name
if isinstance(other, str):
return self.name == other
return False
class KeyTwo(Key):
def __eq__(self, other):
return self.name == getattr(other, 'name', other)
k = "foo"
foo = Key(name=k)
foo_two = KeyTwo(name=k)
print('__eq__() Timings --')
print('isinstance(): ', timeit("foo == k", globals=globals()))
print('getattr(): ', timeit("foo_two == k", globals=globals()))
assert foo == foo_two == k
Results on my M1 Mac:
__eq__() Timings --
isinstance(): 0.10553250007797033
getattr(): 0.08371329202782363