I'm currently working on an application that has the unfortunate requirement of working on complex localized dates and times.
As a simple example, if an event were to happen "today" in Singapore, this is fairly easy to represent: we store the date in UTC, the IANA timezone Asia/Singapore, and perhaps also the effective UTC offset at the given timestamp (e.g. +08:00) so as to not have to consult the IANA database every time we render it.
If you aren't familiar, dealing with timezones is absolutely insane. We can't just assume that Singapore is always +08:00:
- Daylight saving time may or may not happen, different locales start and end DST on different calendar days, and some locales shift by more or less than one hour.
- Over time, DST and the actual UTC offset can change. Made-up examples:
  - In 1971, the start and end dates for DST changed to March 31st and October 1st, respectively, from February 27th and September 16th.
  - In 1933, the DST offset was changed from one hour to one hour and thirty minutes.
- Yes, these things do absolutely happen and the IANA database covers them on a per-timezone-locale basis, which is why we need to store the UTC offset for the given datetime and the relevant timezone identifier.
- Where things get even worse is when the calendars actually in use have changed over time: locales adopted the Gregorian/ISO calendar at different times and, as a result, had to skip up to a couple of weeks of days during the transition.
  - An actual example is Soviet Russia in 1918: prior to January 31st, 1918, Russia and some other nations used the Julian calendar, which by then was thirteen days behind the Gregorian calendar, so a train ride from Russia to Europe that only takes a few hours could move the current date forward by nearly two weeks. When the change was actually made, January 31st was the last day in the Julian calendar and the very next day was February 14th in the Gregorian calendar.
  - In notation, dates from before this transition were accompanied by a specifier of old-style (O.S.) or new-style (N.S.) dates.
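For illustration, the Julian-to-Gregorian shift in the 1918 example can be checked with a small conversion via the Julian Day Number. This is only a sketch: the function name `julian_to_gregorian` is made up here, and it says nothing about which calendar a given place was actually using on a given date.

```python
from datetime import date


def julian_to_gregorian(year: int, month: int, day: int) -> date:
    """Convert a Julian-calendar date to the (proleptic) Gregorian calendar.

    Uses the standard integer Julian Day Number formula for Julian-calendar
    dates, then maps that day number onto Python's Gregorian ordinal.
    """
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    jdn = day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083
    # Gregorian 0001-01-01 has JDN 1721426 and Python ordinal 1.
    return date.fromordinal(jdn - 1721425)


last_julian_day = julian_to_gregorian(1918, 1, 31)   # Gregorian 1918-02-13
first_skipped_day = julian_to_gregorian(1918, 2, 1)  # Gregorian 1918-02-14
print(last_julian_day, first_skipped_day)
```

The day after Julian January 31st, 1918 corresponds to Gregorian February 14th, which matches the transition described above.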
Therefore, in order to properly represent dates before/after these transitions, we must store:
- The datetime as an RFC-3339 timestamp in UTC in the Gregorian/ISO calendar.
- The named timezone closest to where the specific date was relevant, e.g. Asia/Singapore: this means that we also have to collect a location with the date that we can hopefully use to select one of these named timezones.
- The UTC offset for the specific datetime according to the named timezone, e.g. +08:00.
- An optional other calendar (which I'll call a calendar "projection") which was in use in the given locale during the datetime in question, so that dates can be represented in both calendars to further clarify things and to provide better accuracy.
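A minimal sketch of such a record, in Python for brevity. The class name `LocalizedMoment`, its field names, and the free-form `calendar` tag are all hypothetical, and `zoneinfo` needs Python 3.9+ with a tz database available (system files or the `tzdata` package).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional
from zoneinfo import ZoneInfo


@dataclass(frozen=True)
class LocalizedMoment:
    instant_utc: datetime            # timezone-aware moment, normalized to UTC
    zone_id: str                     # IANA zone name, e.g. "Asia/Singapore"
    utc_offset: timedelta            # offset in that zone at that moment
    calendar: Optional[str] = None   # hypothetical tag, e.g. "julian"

    @classmethod
    def create(cls, instant: datetime, zone_id: str,
               calendar: Optional[str] = None) -> "LocalizedMoment":
        if instant.tzinfo is None:
            raise ValueError("instant must be timezone-aware")
        instant_utc = instant.astimezone(timezone.utc)
        # Look the offset up from the tz database for this exact moment.
        offset = instant_utc.astimezone(ZoneInfo(zone_id)).utcoffset()
        return cls(instant_utc, zone_id, offset, calendar)

    def to_rfc3339(self) -> str:
        # isoformat() renders UTC as "+00:00"; RFC 3339 also allows "Z".
        return self.instant_utc.isoformat().replace("+00:00", "Z")


moment = LocalizedMoment.create(
    datetime(2024, 5, 1, 4, 0, tzinfo=timezone.utc), "Asia/Singapore"
)
print(moment.to_rfc3339(), moment.zone_id, moment.utc_offset)
```

The stored offset is only a snapshot of the tz database at the time the record was created, so it may need recomputing if the rules for that zone are later revised.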
As noted above, the IANA database keeps, for each timezone, a historically accurate timeline of UTC offset changes and DST changes. In Java and other programming languages, datetime libraries use this database to convert between UTC and local time for a named timezone.
What I need, however, is hopefully a similar database which can be used to know which calendar was in use for a given timezone or locale so that I can offer a "projection" to the calendar locally in use. I could write my own system for this, incorporating data that I'm able to use to offer these projections, but as everyone knows time is very hard and I do not want to engage in a historical study of calendars to make my own set of rules.
Another problem seems to be finding the right named timezone for a given general geographical location. During and after wars, different geographical locations changed hands, became their own countries, etc. In 1917, the capital of Russia was Petrograd (subsequently Leningrad, subsequently St. Petersburg), but at some point this changed to Moscow. If I have a given general geographic area (e.g. "Kiev" or "Ukraine"), I'll need to try to associate that city with a named timezone somehow, and how do I do that? Do I do a geographical search for the nearest named timezone to an arbitrary city that is within the same latitude?
In summary:
- Does an IANA database exist which tracks when different calendars were in use for a geographical area?
- If I have a geographical area for a given city or country, how can I figure out which named timezone to use for it?
CodePudding user response:
A reminder to the reader about definitions:
- An offset from UTC is merely a number of hours-minutes-seconds ahead of or behind UTC. Modern protocols usually treat positive numbers as ahead of UTC and negative numbers as behind UTC. But some protocols do the opposite, so beware.
- A time zone is much more. A time zone is a named history of the past, present, and future changes to the offset used by the people of a particular region as decided by their politicians.
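To make the distinction concrete, here is a small Python sketch (standard library `zoneinfo`, Python 3.9+, assuming a tz database is installed). The helper name `singapore_offset_in` is made up for this example. A fixed offset never changes, while the Asia/Singapore zone reports different offsets in different years. The last lines show one case of the reversed sign convention: the POSIX-style name `Etc/GMT-8` means eight hours ahead of UTC.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

# An offset: just a number of hours-minutes-seconds, with no history.
fixed_plus_8 = timezone(timedelta(hours=8))

# A time zone: a named history of offsets.
singapore = ZoneInfo("Asia/Singapore")


def singapore_offset_in(year: int) -> timedelta:
    """Offset used in Singapore at local noon on June 1st of the given year."""
    return datetime(year, 6, 1, 12, 0, tzinfo=singapore).utcoffset()


offset_1975 = singapore_offset_in(1975)  # 7 hours 30 minutes ahead of UTC
offset_2024 = singapore_offset_in(2024)  # 8 hours ahead of UTC

# Reversed sign convention: "Etc/GMT-8" is 8 hours AHEAD of UTC.
posix_style = datetime(2024, 1, 1, tzinfo=ZoneInfo("Etc/GMT-8")).utcoffset()

print(offset_1975, offset_2024, posix_style)
```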
> so as to not have to consult the IANA database every time we render them.
Beware: Politicians change the time zone rules. They do so with surprising frequency, and even more surprisingly little forewarning. This happens across cultures and continents, where a wide array of politicians have shown a penchant for twiddling with the time zone rules.
I recommend against pre-calculating an offset from UTC. I recommend storing a moment, a specific point on the timeline, in UTC (an offset from UTC of zero hours-minutes-seconds). For presentation to the user, or where business logic demands, dynamically adjust into a time zone.
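A rough sketch of that approach in Python (standard library `zoneinfo`, Python 3.9+, assuming a tz database is installed). The function name `render` is made up for this example. Only the UTC moment and the zone name are stored; the offset is looked up each time the value is shown.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def render(instant_utc: datetime, zone_id: str) -> str:
    """Adjust a stored UTC moment into the given zone at display time."""
    if instant_utc.tzinfo is None:
        raise ValueError("stored moment must be timezone-aware (UTC)")
    return instant_utc.astimezone(ZoneInfo(zone_id)).isoformat()


summer = datetime(2024, 5, 1, 4, 0, tzinfo=timezone.utc)
winter = datetime(2024, 1, 15, 4, 0, tzinfo=timezone.utc)

print(render(summer, "Asia/Singapore"))    # 2024-05-01T12:00:00+08:00
print(render(summer, "America/New_York"))  # 2024-05-01T00:00:00-04:00
print(render(winter, "America/New_York"))  # 2024-01-14T23:00:00-05:00
```

Because the offset is derived on every call, an updated tz database takes effect without touching the stored data.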
> If you aren't familiar, dealing with timezones is absolutely insane.
Not really “insane”, but yes, amazingly tricky, counter-intuitive, and error-prone.
> We can't just assume that Singapore is always +08:00
No, you cannot. As I said above, offsets and time zones are political time, defined by fickle politicians.
> Daylight savings time may or may not happen, and different locales start and end DST on different calendar days, and some locales may offset by more than or less than one hour.
Yes, politicians often alter the start & stop dates for Daylight Saving Time (DST).
> Yes, these things do absolutely happen
Yes, politicians invent all kinds of adjustments, sometimes quite wacky and senseless.
The newest fad is going onto DST and never coming off it, an everlasting DST. The sun will then never again be at its highest point at noon, defying the very definition of noon.
> the DST offset was changed from one hour to one hour and thirty minutes.
Politicians are free to change the current offset within their time zone(s) by any amount they fancy, any amount of hours-minutes-seconds.
> locales adopted the Gregorian/ISO calendar at different times
Avoid the word “locale” when talking about date-time handling. That word has a specific meaning in localization work. And many developers mistakenly think locale and time zones are related, but they are not. Time zones are tied to legal jurisdictions under the control of specific politicians; locales are not.
Do not conflate the Gregorian calendar with the ISO 8601 calendar. For example, ISO 8601 mandates that weeks start on a Monday, while various Gregorian calendar implementations may start the week on a different day. This in turn means the same date can fall in a different week number under each calendar system.
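As a quick illustration of the week-number difference, here is a Python sketch that compares the ISO 8601 week with the two non-ISO week counts offered by `strftime`, using Sunday, January 3rd, 2021 as an arbitrary sample date.

```python
from datetime import date

d = date(2021, 1, 3)  # a Sunday

# ISO 8601: weeks start on Monday; week 1 is the week containing
# the year's first Thursday.
iso_year, iso_week, iso_weekday = d.isocalendar()

# strftime week counts: days before the first Sunday (%U) or the
# first Monday (%W) of the year fall in week 0.
sunday_start_week = d.strftime("%U")
monday_start_week = d.strftime("%W")

print(iso_year, iso_week, iso_weekday)       # 2020 53 7
print(sunday_start_week, monday_start_week)  # 01 00
```

The same day is week 53 of 2020 under ISO 8601, week 1 of 2021 when weeks start on Sunday, and week 0 of 2021 under the non-ISO Monday-start count.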
The fact that various time zones adopted calendar systems at different points is not a big problem as far as I know. Those changes are accounted for in the IANA database, also known as the tz database, formerly known as the Olson Database. But I may be wrong about that, as I do date-time handling only for contemporary times.