Home > Enterprise >  Datetime discrepancy between Python and Java for dates long ago
Datetime discrepancy between Python and Java for dates long ago

Time:04-28

In Java via the Postgres JDBC driver, I'm running a simple query to select a very long ago datetime: select '0002-02-02 00:00:00'::timestamp. In my Java application gregorian calendar settings

The Postgres driver does enter image description here

However, if I use this same value in Python, I get a datetime that is two days different:

from datetime import datetime, timezone

num_millis=-62101468800000
num_secs=num_millis/1000

# Using the same # of seconds since the epoch as in Java, datetime is different by two days
dt = datetime.fromtimestamp(num_secs, tz=timezone.utc)
print("Datetime", dt)  # Datetime 0002-01-31 00:00:00 00:00
print("Num seconds", dt.timestamp(), "\n")  # Num seconds -62101468800.0 

# Using the same datetime string, seconds since epoch is different by two days worth of seconds
dt2 = datetime.fromisoformat('0002-02-02 00:00:00 00:00')
print("Datetime", dt2)  # Datetime 0002-02-02 00:00:00 00:00
print("Num seconds", dt2.timestamp())  # Num seconds -62101296000.0

Why is there a discrepancy between the Python code and Java code? (Is it possibly due to different calendar systems?) Is there a way to resolve this discrepancy?

CodePudding user response:

If you want the timestamp to be treated as a date on a proleptic Gregorian calendar (which is a big "if"), you can bypass the Julian date conversion that is otherwise performed using the getObject() method to return an OffsetDateTime.

OffsetDateTime odt = rs.getObject(i, OffsetDateTime.class);
/* Lossy conversion if you insist on saddling yourself with the misguided Timestamp type */
java.sql.Timestamp = Timestamp.valueOf(odt.toLocalDateTime());
/* Lossy conversion if you want to adopt a more reasonable representation */
LocalDate date = odt.toLocalDate();

This solution may be specific to the PostgreSQL JDBC Driver, but I've seen other drivers support JSR 310 temporal types via getObject(). It's likely to work with other databases, but test with each.

The discrepancy you see is due to the support for Julian dates that comes by default with Java's older date-time classes. The JSR 310 packages, alternatively, use a proleptic Gregorian calendar.

Because of various calendars in use contemporaneously, conflicting conventions about the first day of the year, and many other reasons, handling historical dates correctly depends heavily on your use cases. If you care about relative ordering and the interval between dates, storing dates according to a proleptic Gregorian calendar works very well, but that requires a thorough understanding of the source of a particular date to convert correctly. If you need to display the date as it was used originally, you should store and display a local date, or simply the original string, with no conversions on input or output. Of course, with two fields, you can support both at once.

CodePudding user response:

The difference is indeed related to different calendar systems, which the following code snippet illustrates:

// Default hybrid Julian/Gregorian calendar
GregorianCalendar cal = new GregorianCalendar();
cal.set(2, 1, 2, 0, 0, 0);
cal.set(Calendar.MILLISECOND, 0);
System.out.println(cal.getTimeInMillis());
// -62101468800000

// Pure proleptic Gregorian calendar
GregorianCalendar cal = new GregorianCalendar();
cal2.setGregorianChange(new Date(Long.MIN_VALUE));
cal2.set(2, 1, 2, 0, 0, 0);
cal2.set(Calendar.MILLISECOND, 0);
System.out.println(cal2.getTimeInMillis());
// -62101296000000

The default behavior of GregorianCalendar is:

a hybrid calendar that supports both the Julian and Gregorian calendar systems with the support of a single discontinuity, which corresponds by default to the Gregorian date when the Gregorian calendar was instituted (October 15, 1582 in some countries, later in others).

By using setGregorianChange we can change the system to be a pure proleptic Gregorian calendar:

To obtain a pure Gregorian calendar, set the change date to Date(Long.MIN_VALUE).

This gives the same -62101296000000 result as Python because it's the same way Python does calculations:

Like a date object, datetime assumes the current Gregorian calendar extended in both directions

Unfortunately, as you mentioned in the question, it appears that at this time the Postgres JDBC driver provides no way to adjust its default behavior as the passed in calendar to getTimestamp isn't fully used. So conversion would have to be done manually in your code afterwards if getTimestamp must be used.

  • Related