Unexpected behaviour when testing equality with naive and tz-aware datetime instances

Time:04-30

The following was produced in Python 3.9.7.

I am well aware that ordering comparisons between tz-aware and naive datetime instances are not allowed in Python and raise a TypeError. However, when testing for equality (with the == and != operators), this is not the case. Instead, == always returns False (and != always returns True):

import datetime
import pytz

t_tz_aware = datetime.datetime(2020, 5, 23, tzinfo=pytz.UTC)
t_naive = datetime.datetime(2020, 5, 23)

# Prints 'False'.
print(t_tz_aware == t_naive)

# Raises TypeError: can't compare offset-naive and offset-aware datetimes.
print(t_tz_aware < t_naive)
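
The same behaviour can be reproduced with the standard library alone. The sketch below assumes datetime.timezone.utc as a stand-in for pytz.UTC, and the helper try_less_than is a made-up name used only to capture the exception instead of aborting the script:

```python
import datetime

aware = datetime.datetime(2020, 5, 23, tzinfo=datetime.timezone.utc)
naive = datetime.datetime(2020, 5, 23)


def try_less_than(a, b):
    """Return the result of a < b, or the exception's type name if it raises."""
    try:
        return a < b
    except TypeError as exc:
        return type(exc).__name__


eq_result = aware == naive               # equality never raises here
ne_result = aware != naive               # inequality is its exact opposite
lt_result = try_less_than(aware, naive)  # ordering is refused

print(eq_result, ne_result, lt_result)   # False True TypeError
```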

I checked the source code of the datetime library and the function for comparing datetime objects has a parameter called allow_mixed (which defaults to False):

def _cmp(self, other, allow_mixed=False): ...

When set to True, which is the case when comparing using the == operator, it is possible to compare tz-aware and naive datetime instances. Otherwise, it raises a TypeError:

# When testing for equality, allow_mixed is set to True.
# For all the other operators, it remains False.
def __eq__(self, other):
    if isinstance(other, datetime):
        return self._cmp(other, allow_mixed=True) == 0
    ...

# Excerpt from the body of _cmp, where myoff and otoff are
# the UTC offsets of self and other:
if myoff is None or otoff is None:
    if allow_mixed:
        return 2  # arbitrary non-zero value
    else:
        raise TypeError("cannot compare naive and aware datetimes")

So this really does seem to be intended behaviour. In fact, pandas' implementation of comparisons for pandas.Timestamp and similar types is consistent with it.

My question is, what is the reasoning? I suppose, like the name of the parameter says, this way we can filter collections of datetime objects that contain both naive and tz-aware instances (i.e., "mixed"). But wouldn't this just introduce a source of potential bugs and unintended behaviour? What am I missing?
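
To make the "mixed collection" idea concrete, here is a small sketch (variable names are illustrative, and datetime.timezone.utc is assumed in place of pytz.UTC). Membership tests and equality filters rely on ==, so they run on a list holding both kinds of instance without raising:

```python
import datetime

utc = datetime.timezone.utc
aware = datetime.datetime(2020, 5, 23, tzinfo=utc)
naive = datetime.datetime(2020, 5, 23)
mixed = [aware, naive]

# `in` and list.index() use ==, so they work on a mixed list.
naive_found = naive in mixed
aware_index = mixed.index(aware)

# Filtering by equality keeps only the instance of the same kind:
# the aware value is skipped because aware == naive is False.
matches_naive = [t for t in mixed if t == naive]

print(naive_found, aware_index, len(matches_naive))  # True 0 1
```

The flip side is the risk raised in the question: a lookup for a naive value silently misses an aware value with the same wall-clock fields, rather than failing loudly.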

EDIT after deceze's comments: this is in fact still "semantically correct" (i.e., the dates are for sure different).

CodePudding user response:

An equality comparison virtually never raises an exception. You just want to know whether object A is equal to object B, which has a clear answer. If they're equal for however those objects define equality, the answer is yes, in all other cases it's no. A naïve datetime differs from an aware one at least insofar as one has a timezone and the other doesn't, so they're clearly not equal.
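
As a quick illustration of that convention (the values are arbitrary examples), comparing objects of unrelated types with == yields False rather than an exception:

```python
import datetime

naive = datetime.datetime(2020, 5, 23)

# Each of these pairs is "not equal" by definition; none of them raises.
checks = [
    1 == "1",
    [1, 2] == (1, 2),
    naive == "2020-05-23",
    naive == object(),
]

print(checks)  # [False, False, False, False]
```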

For greater-than/less-than comparisons, though, the objects need to be orderable; otherwise the question cannot be answered. A < comparison can't return False merely because the objects can't be compared, because that would imply that the opposite >= comparison should return True, which it can't either. So raising an error is the correct third possible outcome.
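
A short sketch of this argument, again assuming datetime.timezone.utc for the aware value; raises_type_error is a hypothetical helper, not part of any library. For two naive values, < and >= are exact opposites, while for a mixed pair both operators raise, as does anything built on them such as sorted():

```python
import datetime

utc = datetime.timezone.utc
aware = datetime.datetime(2020, 5, 23, tzinfo=utc)
naive = datetime.datetime(2020, 5, 23)
later = datetime.datetime(2020, 5, 24)

# For orderable values, a < b is always the negation of a >= b.
opposites_hold = (naive < later) == (not (naive >= later))


def raises_type_error(func):
    """Call func() and report whether it raised TypeError."""
    try:
        func()
    except TypeError:
        return True
    return False


# For a mixed pair, neither direction has a sensible answer.
lt_raises = raises_type_error(lambda: aware < naive)
ge_raises = raises_type_error(lambda: aware >= naive)

# Sorting relies on <, so a mixed list cannot be sorted either.
sort_raises = raises_type_error(lambda: sorted([aware, naive]))

print(opposites_hold, lt_raises, ge_raises, sort_raises)  # True True True True
```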
