I have an excel file that I want to open with pandas which contains Dates like that :
but after I did my upload, I can't recover any of these dates : Pandas is converting my dates as timestamp.
df = pd.read_excel(excelfile)
df['Dates']
0 44327
1 44340
2 44343
3 44385
4 44341
pd.to_datetime(df['Dates'], errors='coerce', unit='D')
0 2091-05-13
1 2091-05-26
2 2091-05-29
3 2091-07-10
4 2091-05-27
Even after pd.to_datetime, this is the wrong year and day. If I take off unit='D', I obtain :
pd.to_datetime(df['Dates'], errors='coerce')
0 1970-01-01 00:00:00.000044327
1 1970-01-01 00:00:00.000044340
2 1970-01-01 00:00:00.000044343
3 1970-01-01 00:00:00.000044385
4 1970-01-01 00:00:00.000044341
Any thoughts ?
CodePudding user response:
I've found a solution : the timestamps were the number of days since 01/01/1900 so I needed to add an origin like :
pd.to_datetime(pd.to_numeric(df['Dates'], errors='coerce'),unit='D',origin='1899-12-30')
You can also use the xlrd package (see the link) :
date = xlrd.xldate_as_datetime(xl_date,0)
https://www.geeksforgeeks.org/python-convert-excel-serial-date-to-datetime/