I need to transform a date, expressed as a number of seconds since 2000-01-01T00:00:00, to a pandas.Timestamp
with a resolution of 1 ns.
I have found two options:
- Use:
pandas.to_datetime(VALUE, unit='s', origin=pandas.Timestamp(2000, 1, 1))
- Use:
pandas.Timestamp(2000, 1, 1) + pandas.to_timedelta(VALUE, unit='sec')
I was expecting both of them to provide the same result, but the results are slightly different, e.g.:
In [2]: Y2K = pandas.Timestamp(2000, 1, 1)
...:
...: s = 538121125.6849735
...:
...: t1 = pandas.to_datetime(s, unit='s', origin=Y2K)
   ...: t2 = Y2K + pandas.to_timedelta(s, unit='sec')
...:
...: t1 - t2
Out[2]: Timedelta('0 days 00:00:00.000000090')
Am I doing something wrong? Can this discrepancy be considered a bug?
Which is the correct way to perform this task? Please note that I need resolution down to 1 ns.
CodePudding user response:
I wouldn't say that's a bug; it's just an incorrect use of the pandas.to_datetime
method. The second method you proposed seems to be the proper one. It is more accurate because it takes into account the fact that a Timestamp is a combination of a date and a time, whereas the first method only takes the date component into account.
CodePudding user response:
A float (double precision) can store only about 15 significant decimal digits (15.95, strictly). You use 9 of them for the integer part, so you can expect only about 7 digits of precision in the fractional part, and that is exactly what you get.
In any case, do you expect that much precision from any clock?
As @ChrisQ267 mentioned in the other answer, programs tend to store times in separate components to retain more precision. Common layouts are either date and time in two fields, or whole seconds and the fractional part of a second in two fields. So a single float
is not ideal for high-precision timestamps.
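The split-components idea can be sketched as follows. The two integer values here are hypothetical, chosen to resemble the example above; the point is that if whole seconds and fractional nanoseconds arrive as separate integers, the conversion is exact:

```python
import pandas as pd

Y2K = pd.Timestamp(2000, 1, 1)

# Hypothetical input kept in two integer fields instead of one float:
whole_s = 538121125   # whole seconds since the 2000-01-01 epoch
frac_ns = 684973500   # fractional part, already in nanoseconds

# Integers convert exactly, so no precision is lost at any step.
t = Y2K + pd.to_timedelta(whole_s, unit='s') + pd.to_timedelta(frac_ns, unit='ns')
print(t)
```

Note this only helps if the components are separate from the start; once they are combined into a single float, the sub-ulp information is already gone.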
But in any case, both methods you are using are imprecise: both miss leap seconds, so the real result is already off by several seconds (not just in the 8th decimal place).