If a list is unhashable in Python, why is a class instance with list attribute not?

Time:01-14

Firstly, we have a normal list:

ingredients = ["hot water", "taste"]

Trying to print this list's hash will expectedly raise a TypeError:

print(hash(ingredients))

>>> TypeError: unhashable type: 'list'

which means we cannot use it as a dictionary key, for example.

But now suppose we have a Tea class that takes only one argument: a list.

class Tea:
    
    def __init__(self, ingredients: list|None = None) -> None:

        self.ingredients = ingredients
        if ingredients is None:
            self.ingredients = []

Surprisingly, creating an instance and printing its hash will not raise an error:

cup = Tea(["hot water", "taste"])
print(hash(cup))

>>> 269041261

This shows that the object is hashable, even though it is little more than a wrapper around a list. Trying to print the hash of its ingredients attribute, however, raises the expected error:

print(hash(cup.ingredients))

>>> TypeError: unhashable type: 'list'

Why is this the case? Shouldn't the presence of the list — being an unhashable type — make it impossible to hash any object that 'contains' a list? For example, now it is possible to use our cup as a dictionary key:

dct = {
    cup: "test"
}

despite the fact that the cup is more or less a list in its functionality. So if you really want to use a list (or another unhashable type) as a dictionary key, isn't it possible to do it this way? (This is not my main question, just a side consequence.)

Why doesn't the presence of the list make the entire datatype unhashable?

CodePudding user response:

It looks like there is a misunderstanding of what a hash is. The rule is that a hash must never change over the whole life of an object, and must be compatible with equality: objects that compare equal must have the same hash. Whether the object itself is mutable is not what matters.

Two distinct list objects that have the same elements compare equal. That is the actual reason a list is not hashable. Suppose we managed to compute a hash for a class that mimics a list, including the equality part. Let us create two distinct instances with different elements and therefore distinct hash values. No problem so far. Now let us create a third instance with the same elements as the first one. The two compare equal, so their hash values must be the same. But if we then change the elements of the third instance to match those of the second one, its hash would have to become the hash of the second instance, which is forbidden because a hash value must not change over the lifetime of an object.
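The problem described above can be sketched in a few lines. HashedList is a hypothetical class invented for this illustration (it is not part of Python); it deliberately breaks the rule by combining content-based equality with a content-based hash on a mutable object:

```python
class HashedList:
    """Hypothetical list-like class: equality AND hash depend on contents."""

    def __init__(self, items):
        self.items = list(items)

    def __eq__(self, other):
        return isinstance(other, HashedList) and self.items == other.items

    def __hash__(self):
        # Content-based hash: its value changes whenever the items change.
        return hash(tuple(self.items))


key = HashedList(["hot water", "taste"])
lookup = {key: "test"}
hash_before = hash(key)

key.items.append("sugar")  # mutate the key after it was stored in the dict
hash_after = hash(key)

same_as_original = HashedList(["hot water", "taste"])
same_as_current = HashedList(["hot water", "taste", "sugar"])

# The stored entry still carries the old hash, but the key now has new contents.
print(hash_before != hash_after)   # the hash changed during the object's life
print(same_as_original in lookup)  # old hash matches, but equality now fails
print(same_as_current in lookup)   # equal to the key, but the stored hash differs
```

Neither probe object finds the entry, so the value stored under key can no longer be reached by an equal object. This is the kind of breakage the built-in list avoids by refusing to be hashed at all.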

But you have only created a class that happens to contain a list. By default, two distinct instances will not compare equal even if they contain identical lists, because the default equality is based on identity. Because of that, your class is hashable, and in CPython the default hash is derived from the address of the object. It only becomes unhashable if you add an __eq__ special method (without also defining __hash__) that makes two objects compare equal when the lists they contain are equal. In that case Python sets __hash__ to None for the class, because the hash function could no longer be both compatible with equality and constant over the lifetime of the object.
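A minimal sketch of that difference, reusing the Tea class from the question. ComparableTea is a hypothetical subclass added only for this illustration:

```python
class Tea:
    def __init__(self, ingredients=None):
        self.ingredients = [] if ingredients is None else ingredients


class ComparableTea(Tea):
    # Hypothetical subclass: equality now depends on the list contents.
    # Defining __eq__ without __hash__ makes Python set __hash__ to None.
    def __eq__(self, other):
        return (isinstance(other, ComparableTea)
                and self.ingredients == other.ingredients)


a = Tea(["hot water", "taste"])
b = Tea(["hot water", "taste"])
default_equal = (a == b)     # identity-based equality: False
default_hash = hash(a)       # works, the value varies between runs

try:
    hash(ComparableTea(["hot water", "taste"]))
    comparable_hashable = True
except TypeError:
    comparable_hashable = False

print(default_equal)         # False
print(comparable_hashable)   # False
```

So a plain Tea instance can be used as a dictionary key, but it is looked up by identity: a second Tea with the same ingredients is a different key.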
