Home > Net >  Is there a need for transitivity in Python __eq__?
Is there a need for transitivity in Python __eq__?

Time:03-15

I'm implementing my own class, with custom __eq__. And I'd like to return True for things that are not "equal" in a mathematical sense, but "match" in a fuzzy way.

An issue with this is, however, that this leads to loss of transitivity in a mathematical sense, i.e. a == b && b ==c, while a may not be equal to c.

Question: is Python dependent on __eq__ being transitive? Will what I'm trying to do break things, or is it possible to do this as long as I'm careful myself not to assume transitivity?

Use case

I want to match telephone numbers with one another, while those may be either formatted internationally, or just for domestic use (without a country code specified). If there's no country code specified, I'd like a number to be equal to a number with one, but if it is specified, it should only be equal to numbers with the same country-code, or without one.

So:

  • Of course, 31 6 12345678 should equal 31 6 1234567, and 06 12345678 should equal 06 12345678
  • 31 6 12345678 should equal 06 12345678 (and v.v.)
  • 49 6 12345678 should equal 06 12345678 (and v.v.)
  • But 31 6 12345678 should not be equal to 49 6 12345678

Edit: I don't have a need for hashing (and so won't implement it), so that at least makes life easier.

CodePudding user response:

There is no MUST but a SHOULD relation for comparisons being consistent with the commonly understood relations. Python expressively does not enforce this and float is an inbuilt type with different behaviour due to float("nan").

Expressions: Value comparisons

[…]
User-defined classes that customize their comparison behavior should follow some consistency rules, if possible:

  • […]
  • Comparison should be transitive. The following (non-exhaustive) examples illustrate that:
    • x > y and y > z implies x > z
    • x < y and y <= z implies x < z

Python does not enforce these consistency rules. In fact, the not-a-number values are an example for not following these rules.

Still, keep in mind that exceptions are incredibly rare and subject to being ignored: most people would treat float as having total order, for example. Using uncommon comparison relations can seriously increase maintenance effort.


Canonical ways to model "fuzzy matching" via operators are as subset, subsequence or containment using unsymmetric operators.

  • The set and frozenset support >, >= and so on to indicate that one set encompases all values of another.
    >>> a, b = {1, 5, 6, 8}, {5, 6}
    >>> a >= a, a >= b, b >= a
    (True, True, False)
    
  • The str and bytes support in to indicate that subsequences are covered.
    >>> a, b = " 31 6 12345678", "6 12345678"
    >>> a in b, b in a
    (False, True)
    
  • The range and ipaddress Networks support in to indicate that specific items are covered.
    >>> IPv4Address('192.0.2.6') in IPv4Network('192.0.2.0/28')
    True
    

Notably, while these operators may be transitive they are not symmetric. For example, a >= b and c >= b does not imply b >= c and thus not a >= c or vice versa.

Practically, one could model "number without country code" as the superset of "number with country code" for the same number. This means that 06 12345678 >= 31 6 12345678 and 06 12345678 >= 49 6 12345678 but not vice versa. In order to do a symmetric comparison, one would use a >= b or b >= a instead of a == b.

CodePudding user response:

__eq__ method should be transitive; at least it is what dictionaries assume.

class A:
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        for element in self.values:
            if element is other:
                return True
        return False

    def __hash__(self):
        return 0

    def __repr__(self):
        return self.name

x, y, z = A('x'), A('y'), A('z')
x.values = [x,y]
y.values = [x,y,z]
z.values = [y,z]

print(x == y)
--> True

print (y == z)
--> True

print(x == z)
--> False

print({**{x:1},**{y:2, z: 3}})
--> {x: 3}

print({**{x:1},**{z:3, y:2}})
--> {x: 1, z: 2}

{**{x:1},**{y: 2, z:3}} is the union of two dictionaries. No one expects a dictionary to delete a key after updating it.

print(z in {**{x:1},**{y:2, z: 3}})
--> False

By changing the order in the union you can even get different sized dictionaries:

print(len({**{x:1},**{y:2, z: 3}}))
--> 1

print(len({**{x:1},**{z:3, y:2}}))
--> 2
  • Related