Home > Blockchain >  converting datetime column to total hours and offsetting from reference time
converting datetime column to total hours and offsetting from reference time

Time:08-06

I have the following datetime object:

import pandas as pd
from datetime import datetime
t0=datetime.strptime("01/01/2011 00:00:00", "%d/%m/%Y %H:%M:%S")

here, t0 is my reference or start time of simulation. I wanted to convert it into total hours (but failed) so that I can add them to my Hours df column and finally convert into a datetime column that could start from 2021-01-01.

I have a following Hours column which calculates hours from the start time t0:

My model results in hours: 

    Hours
0   44317.0
1   44317.250393519
2   44317.500138889
3   44317.750462963
4   44318.00005787
5   44318.250266204
6   44318.500543981
7   44318.7503125
8   44319.000520833
9   44319.250729167
10  44319.500428241

In excel if I convert this hours into date format it becomes 2021-05-01, like this which is my expected output:

My expected output:



Hours
1   5/1/21 0:00
2   5/1/21 6:00
3   5/1/21 12:00
4   5/1/21 18:00
5   5/2/21 0:00
6   5/2/21 6:00
7   5/2/21 12:00
8   5/2/21 18:00
9   5/3/21 0:00
10  5/3/21 0:00

However, in python if I can converted this Hours column into a datetime column named date using pd.to_datetime(df.Hours)` it starts from 1970-01-01.

My python output which I don't want:

    Hours
0   1970-01-01 00:00:00.000044317
1   1970-01-01 00:00:00.000044317
2   1970-01-01 00:00:00.000044317
3   1970-01-01 00:00:00.000044317
4   1970-01-01 00:00:00.000044318
5   1970-01-01 00:00:00.000044318
6   1970-01-01 00:00:00.000044318
7   1970-01-01 00:00:00.000044318
8   1970-01-01 00:00:00.000044319
9   1970-01-01 00:00:00.000044319
10  1970-01-01 00:00:00.000044319

Please let me know how to convert it so that it starts from 1st May, 2021.

Solution: From Michael S. answere below: The Hours column is actually not hours but days and using pd.to_datetime(df.Hours, unit='d',origin='1900-01-01') will give the right results. The software that I am using also uses excel like epoch time of '1900-01-01' and mistakenly says the days as hours.

CodePudding user response:

Here is an update to the answer with OP's edits and inputs. Excel is weird with dates, so if you have to convert your timestamps (44317 etc) to Excel's dates, you have to do some odd additions to put the dates in line with Excel's (Pandas and Excel have different "Start of Time" dates, that's why you are seeing the different values e.g. 1970 vs 2021). Your 44317 etc numbers are actually days and you have to add 1899-12-30 to those days:

hours = [44317.0, 44317.250393519, 44317.500138889, 44317.750462963, 
         44318.00005787, 44318.250266204, 44318.500543981, 44318.7503125, 
         44319.000520833, 44319.250729167, 44319.500428241]
df = pd.DataFrame({"Hours":hours})

t0=datetime.strptime("01/01/2011 00:00:00", "%d/%m/%Y %H:%M:%S")
df["Actual Date"] = pd.TimedeltaIndex(df['Hours'], unit='d')   datetime(1899, 12, 30)
# Alternateive is   pd.to_datetime(df.Hours, unit='d', origin='1899-12-30')

Output:

Hours   Actual Date
0   44317.000000    2021-05-01 00:00:00.000000000
1   44317.250394    2021-05-01 06:00:34.000041600
2   44317.500139    2021-05-01 12:00:12.000009600
3   44317.750463    2021-05-01 18:00:40.000003200
4   44318.000058    2021-05-02 00:00:04.999968000
5   44318.250266    2021-05-02 06:00:23.000025600
6   44318.500544    2021-05-02 12:00:46.999958400
7   44318.750313    2021-05-02 18:00:27.000000000
8   44319.000521    2021-05-03 00:00:44.999971199
9   44319.250729    2021-05-03 06:01:03.000028799
10  44319.500428    2021-05-03 12:00:37.000022400

There are ways to clean up the format, but this is the correct time as you wanted.

To match your output exactly, you can do this, just be aware that the contents of the cells in the column "Corrected Format" are now string values and not datetime values. If you want to use them as datetime values then you'll have to convert them back again:

df["Corrected Format"] = df["Actual Date"].dt.strftime("%d/%m/%Y %H:%M")

Output


    Hours                   Actual Date             Corrected Format
0   44317.000000    2021-05-01 00:00:00.000000000   01/05/2021 00:00
1   44317.250394    2021-05-01 06:00:34.000041600   01/05/2021 06:00
2   44317.500139    2021-05-01 12:00:12.000009600   01/05/2021 12:00
3   44317.750463    2021-05-01 18:00:40.000003200   01/05/2021 18:00
4   44318.000058    2021-05-02 00:00:04.999968000   02/05/2021 00:00
5   44318.250266    2021-05-02 06:00:23.000025600   02/05/2021 06:00
6   44318.500544    2021-05-02 12:00:46.999958400   02/05/2021 12:00
7   44318.750313    2021-05-02 18:00:27.000000000   02/05/2021 18:00
8   44319.000521    2021-05-03 00:00:44.999971199   03/05/2021 00:00
9   44319.250729    2021-05-03 06:01:03.000028799   03/05/2021 06:01
10  44319.500428    2021-05-03 12:00:37.000022400   03/05/2021 12:00
  • Related