Theory and pragmatics of the tz code and data

Scope of the tz database

The tz database attempts to record the history and predicted future of all computer-based clocks that track civil time. It organizes time zone and daylight saving time data by partitioning the world into timezones whose clocks all agree about timestamps that occur after the POSIX Epoch (1970-01-01 00:00:00 UTC). The database labels each timezone with a notable location and records all known clock transitions for that location. Although 1970 is a somewhat-arbitrary cutoff, there are significant challenges to moving the cutoff earlier even by a decade or two, due to the wide variety of local practices before computer timekeeping became prevalent.

Each timezone typically corresponds to a geographical region that is smaller than a traditional time zone, because clocks in a timezone all agree after 1970 whereas a traditional time zone merely specifies current standard time. For example, applications that deal with current and future timestamps in the traditional North American mountain time zone can choose from the timezones America/Denver which observes US-style daylight saving time, America/Mazatlan which observes Mexican-style DST, and America/Phoenix which does not observe DST. Applications that also deal with past timestamps in the mountain time zone can choose from over a dozen timezones, such as America/Boise, America/Edmonton, and America/Hermosillo, each of which currently uses mountain time but differs from other timezones for some timestamps after 1970.

Clock transitions before 1970 are recorded for each timezone, because most systems support timestamps before 1970 and could misbehave if data entries were omitted for pre-1970 transitions. However, the database is not designed for and does not suffice for applications requiring accurate handling of all past times everywhere, as it would take far too much effort and guesswork to record all details of pre-1970 civil timekeeping. Although some information outside the scope of the database is collected in a file backzone that is distributed along with the database proper, this file is less reliable and does not necessarily follow database guidelines.

As described below, reference source code for using the tz database is also available. The tz code is upwards compatible with POSIX, an international standard for UNIX-like systems. As of this writing, the current edition of POSIX is The Open Group Base Specifications Issue 7, IEEE Std 1003.1-2017, 2018 Edition. Because the database's scope encompasses real-world changes to civil timekeeping, its model for describing time is more complex than the standard and daylight saving times supported by POSIX. A tz timezone corresponds to a ruleset that can have more than two changes per year; these changes need not merely flip back and forth between two alternatives, and the rules themselves can change at times. Whether and when a timezone changes its clock, and even the timezone's notional base offset from UTC, are variable. It does not always make sense to talk about a timezone's "base offset", which is not necessarily a single number.

Names of timezones

Each timezone has a unique name. Inexperienced users are not expected to select these names unaided. Distributors should provide documentation and/or a simple selection interface that explains each name via a map or via descriptive text like "Ruthenia" instead of the timezone name "Europe/Uzhgorod". If geolocation information is available, a selection interface can locate the user on a timezone map or prioritize names that are geographically close. For an example selection interface, see the tzselect program in the tz code. The Unicode Common Locale Data Repository contains data that may be useful for other selection interfaces; it maps timezone names like Europe/Uzhgorod to CLDR names like uauzh which are in turn mapped to locale-dependent strings like "Uzhhorod", "Ungvár", "Ужгород", and "乌日哥罗德".

The naming conventions attempt to strike a balance among the following goals:

Names normally have the form AREA/LOCATION, where AREA is a continent or ocean, and LOCATION is a specific location within the area. North and South America share the same area, 'America'. Typical names are 'Africa/Cairo', 'America/New_York', and 'Pacific/Honolulu'. Some names are further qualified to help avoid confusion; for example, 'America/Indiana/Petersburg' distinguishes Petersburg, Indiana from other Petersburgs in America.
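As a minimal sketch of the AREA/LOCATION form described above, the following Python splits a name at its first slash, so that any further qualification stays with the location. The helper name split_timezone_name is hypothetical and is not part of the tz code; it checks only the shape of a name, not whether the name exists in the database.

```python
def split_timezone_name(name):
    """Split 'AREA/LOCATION' into (area, location).

    Only the first slash separates the area; any further qualification
    stays in the location part. Names without a slash are rejected.
    """
    area, sep, location = name.partition("/")
    if not sep or not area or not location:
        raise ValueError(f"not of the form AREA/LOCATION: {name!r}")
    return area, location


for name in ("Africa/Cairo", "America/New_York", "America/Indiana/Petersburg"):
    print(split_timezone_name(name))
```

Under this sketch, 'America/Indiana/Petersburg' yields the area 'America' and the location 'Indiana/Petersburg'.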

Here are the general guidelines used for choosing timezone names, in decreasing order of importance:

The file 'zone1970.tab' lists geographical locations used to name timezones. It is intended to be an exhaustive list of names for geographic regions as described above; this is a subset of the timezones in the data. Although a 'zone1970.tab' location's longitude corresponds to its local mean time (LMT) offset with one hour for every 15° east longitude, this relationship is not exact.
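The longitude rule of thumb can be illustrated with a short Python sketch. One hour for every 15° is 240 seconds per degree. The function name and the sample longitudes are hypothetical, and the result is only an approximation, since the relationship between a listed location and its LMT offset is not exact.

```python
def approximate_lmt_offset_seconds(longitude_deg_east):
    """Approximate LMT offset from UT, in seconds, for a longitude in degrees east.

    One hour per 15 degrees is 240 seconds per degree. Western longitudes
    are negative and give negative (behind UT) offsets.
    """
    return round(longitude_deg_east * 240)


# Hypothetical sample longitudes, chosen only for illustration.
print(approximate_lmt_offset_seconds(15.0))    # 3600, one hour ahead of UT
print(approximate_lmt_offset_seconds(-75.0))   # -18000, five hours behind UT
```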

Older versions of this package used a different naming scheme, and these older names are still supported. See the file 'backward' for most of these older names (e.g., 'US/Eastern' instead of 'America/New_York'). The other old-fashioned names still supported are 'WET', 'CET', 'MET', and 'EET' (see the file 'europe').

Older versions of this package defined legacy names that are incompatible with the first guideline of location names, but which are still supported. These legacy names are mostly defined in the file 'etcetera'. Also, the file 'backward' defines the legacy names 'GMT0', 'GMT-0' and 'GMT+0', and the file 'northamerica' defines the legacy names 'EST5EDT', 'CST6CDT', 'MST7MDT', and 'PST8PDT'.

Excluding 'backward' should not affect the other data. If 'backward' is excluded, excluding 'etcetera' should not affect the remaining data.

Time zone abbreviations

When this package is installed, it generates time zone abbreviations like 'EST' to be compatible with human tradition and POSIX. Here are the general guidelines used for choosing time zone abbreviations, in decreasing order of importance:

Application writers should note that these abbreviations are ambiguous in practice: e.g., 'CST' means one thing in China and something else in North America, and 'IST' can refer to time in India, Ireland or Israel. To avoid ambiguity, use numeric UT offsets like '-0600' instead of time zone abbreviations like 'CST'.
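As one way to follow this advice, the sketch below formats a timestamp with a numeric UT offset using Python's standard datetime module. The date is arbitrary and the fixed offset of six hours behind UT is chosen only to match the '-0600' example above; it does not model any particular timezone's rules.

```python
from datetime import datetime, timedelta, timezone

# A fixed offset six hours behind UT, the offset that 'CST' denotes
# in North America. This is a plain offset, not a tz timezone.
minus_six = timezone(timedelta(hours=-6))

stamp = datetime(2020, 1, 15, 12, 0, tzinfo=minus_six)

# %z emits the numeric offset instead of an ambiguous abbreviation.
label = stamp.strftime("%Y-%m-%d %H:%M %z")
print(label)  # 2020-01-15 12:00 -0600
```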

Accuracy of the tz database

The tz database is not authoritative, and it surely has errors. Corrections are welcome and encouraged; see the file CONTRIBUTING. Users requiring authoritative data should consult national standards bodies and the references cited in the database's comments.

Errors in the tz database arise from many sources:

In short, many, perhaps most, of the tz database's pre-1970 and future timestamps are either wrong or misleading. Any attempt to pass the tz database off as the definition of time should be unacceptable to anybody who cares about the facts. In particular, the tz database's LMT offsets should not be considered meaningful, and should not prompt creation of timezones merely because two locations differ in LMT or transitioned to standard time at different dates.

Time and date functions

The tz code contains time and date functions that are upwards compatible with those of POSIX. Code compatible with this package is already part of many platforms, where the primary use of this package is to update obsolete time-related files. To do this, you may need to compile the time zone compiler 'zic' supplied with this package instead of using the system 'zic', since the format of zic's input is occasionally extended, and a platform may still be shipping an older zic.

POSIX properties and limitations

Extensions to POSIX in the tz code

POSIX features no longer needed

POSIX and ISO C define some APIs that are vestigial: they are not needed, and are relics of a too-simple model that does not suffice to handle many real-world timestamps. Although the tz code supports these vestigial APIs for backwards compatibility, they should be avoided in portable applications. The vestigial APIs are:

Other portability notes

Interface stability

The tz code and data supply the following interfaces:

Interface changes in a release attempt to preserve compatibility with recent releases. For example, tz data files typically do not rely on recently-added zic features, so that users can run older zic versions to process newer data files. "Downloading the tz database" describes how releases are tagged and distributed.

Interfaces not listed above are less stable. For example, users should not rely on particular UT offsets or abbreviations for timestamps, as data entries are often based on guesswork and these guesses may be corrected or improved.

Calendrical issues

Calendrical issues are a bit out of scope for a time zone database, but they indicate the sort of problems that we would run into if we extended the time zone database further into the past. An excellent resource in this area is Edward M. Reingold and Nachum Dershowitz, Calendrical Calculations: The Ultimate Edition, Cambridge University Press (2018). Other information and sources are given in the file 'calendars' in the tz distribution. They sometimes disagree.

Time and time zones on other planets

Some people's work schedules use Mars time. Jet Propulsion Laboratory (JPL) coordinators kept Mars time on and off during the Mars Pathfinder mission. Some of their family members also adapted to Mars time. Dozens of special Mars watches were built for JPL workers who kept Mars time during the Mars Exploration Rovers mission (2004). These timepieces look like normal Seikos and Citizens but use Mars seconds rather than terrestrial seconds.

A Mars solar day is called a "sol" and has a mean period equal to about 24 hours 39 minutes 35.244 seconds in terrestrial time. It is divided into a conventional 24-hour clock, so each Mars second equals about 1.02749125 terrestrial seconds.
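The length of a Mars second follows directly from these figures, as this short Python check shows. The variable names are illustrative only.

```python
# Mean length of a sol in terrestrial seconds: 24 h 39 min 35.244 s.
SOL_SECONDS = 24 * 3600 + 39 * 60 + 35.244

# A sol is divided into 24 * 60 * 60 Mars seconds, the same count as
# the terrestrial seconds in a terrestrial day.
MARS_SECONDS_PER_SOL = 24 * 60 * 60

mars_second = SOL_SECONDS / MARS_SECONDS_PER_SOL
print(f"{mars_second:.8f}")  # 1.02749125
```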

The prime meridian of Mars goes through the center of the crater Airy-0, named in honor of the British astronomer who built the Greenwich telescope that defines Earth's prime meridian. Mean solar time on the Mars prime meridian is called Mars Coordinated Time (MTC).

Each landed mission on Mars has adopted a different reference for solar timekeeping, so there is no real standard for Mars time zones. For example, the Mars Exploration Rover project (2004) defined two time zones "Local Solar Time A" and "Local Solar Time B" for its two missions, each zone designed so that its time equals local true solar time at approximately the middle of the nominal mission. Such a "time zone" is not particularly suited for any application other than the mission itself.

Many calendars have been proposed for Mars, but none have achieved wide acceptance. Astronomers often use Mars Sol Date (MSD) which is a sequential count of Mars solar days elapsed since about 1873-12-29 12:00 GMT.
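A rough Mars Sol Date can be sketched from the approximate epoch above and the mean sol length. This is only an illustration: it treats the epoch as exactly 1873-12-29 12:00 UTC and ignores the difference between UTC and the time scale astronomers use, so it will differ slightly from published MSD values. The function and constant names are hypothetical.

```python
from datetime import datetime, timezone

# Approximate epoch taken from the text ("about 1873-12-29 12:00 GMT").
MSD_EPOCH = datetime(1873, 12, 29, 12, 0, tzinfo=timezone.utc)

# Mean sol length in terrestrial days (24 h 39 min 35.244 s).
SOL_IN_EARTH_DAYS = 88775.244 / 86400


def approximate_msd(when):
    """Return a rough Mars Sol Date for an aware datetime."""
    elapsed_days = (when - MSD_EPOCH).total_seconds() / 86400
    return elapsed_days / SOL_IN_EARTH_DAYS


msd = approximate_msd(datetime(2000, 1, 6, tzinfo=timezone.utc))
print(f"{msd:.2f}")
```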

In our solar system, Mars is the planet with time and calendar most like Earth's. On other planets, Sun-based time and calendars would work quite differently. For example, although Mercury's sidereal rotation period is 58.646 Earth days, Mercury revolves around the Sun so rapidly that an observer on Mercury's equator would see a sunrise only every 175.97 Earth days, i.e., a Mercury year is 0.5 of a Mercury day. Venus is more complicated, partly because its rotation is slightly retrograde: its year is 1.92 of its days. Gas giants like Jupiter are trickier still, as their polar and equatorial regions rotate at different rates, so that the length of a day depends on latitude. This effect is most pronounced on Neptune, where the day is about 12 hours at the poles and 18 hours at the equator.
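The Mercury figures can be checked with the usual relation between sidereal rotation, orbital period, and solar day for a planet that rotates in the same direction as it orbits. The sketch below takes the rotation period from the text; the orbital period of 87.969 Earth days is an assumption that the text does not state. Because of rounding in the inputs, the result lands close to the sunrise interval quoted above rather than exactly on it.

```python
def solar_day(sidereal_rotation_days, orbital_period_days):
    """Length of a solar day for prograde rotation, in the same units.

    The planet's spin rate and orbital rate subtract, because orbital
    motion partly cancels the apparent motion of the Sun in the sky.
    """
    return 1.0 / (1.0 / sidereal_rotation_days - 1.0 / orbital_period_days)


MERCURY_ROTATION_DAYS = 58.646   # sidereal rotation period, from the text
MERCURY_YEAR_DAYS = 87.969       # orbital period; assumed, not from the text

mercury_day = solar_day(MERCURY_ROTATION_DAYS, MERCURY_YEAR_DAYS)
year_per_day = MERCURY_YEAR_DAYS / mercury_day

print(f"solar day: about {mercury_day:.1f} Earth days")
print(f"year / day: about {year_per_day:.2f}")
```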

Although the tz database does not support time on other planets, it is documented here in the hopes that support will be added eventually.

Sources for time on other planets:
