Home > Blockchain >  pd.to_datetime returning wrong Year and wrong Day
pd.to_datetime returning wrong Year and wrong Day

Time:07-03

I have an excel file that I want to open with pandas which contains Dates like that :

enter image description here

but after I did my upload, I can't recover any of these dates : Pandas is converting my dates as timestamp.

df = pd.read_excel(excelfile)

df['Dates']
0      44327
1      44340
2      44343
3      44385
4      44341

pd.to_datetime(df['Dates'], errors='coerce', unit='D')

0     2091-05-13
1     2091-05-26
2     2091-05-29
3     2091-07-10
4     2091-05-27

Even after pd.to_datetime, this is the wrong year and day. If I take off unit='D', I obtain :

pd.to_datetime(df['Dates'], errors='coerce')
0     1970-01-01 00:00:00.000044327
1     1970-01-01 00:00:00.000044340
2     1970-01-01 00:00:00.000044343
3     1970-01-01 00:00:00.000044385
4     1970-01-01 00:00:00.000044341

Any thoughts ?

CodePudding user response:

I've found a solution : the timestamps were the number of days since 01/01/1900 so I needed to add an origin like :

pd.to_datetime(pd.to_numeric(df['Dates'], errors='coerce'),unit='D',origin='1899-12-30')

You can also use the xlrd package (see the link) :

date = xlrd.xldate_as_datetime(xl_date,0)

https://www.geeksforgeeks.org/python-convert-excel-serial-date-to-datetime/

  • Related