I'm having a hard time converting what is supposed to be a datetime column from an excel file. When opening it with pandas I get 44710.37680 instead of 5/29/2022 9:02:36. I tried this peace of code to convert it.
df = pd.read_excel(file,'Raw')
df.to_csv(finalfile, index = False)
df = pd.read_csv(finalfile)
df['First LogonTime'] = df['First LogonTime'].apply(lambda x: pd.Timestamp(x).strftime('%Y-%m-%d %H:%M:%S'))
print(df)
And the result I get is 1970-01-01 00:00:00 :c
Don't know if this helps but its an .xlsb file that I'm working with.
CodePudding user response:
You can use unit='d'
(for days) and substract 70 years:
pd.to_datetime(44710.37680, unit='d') - pd.DateOffset(years=70)
Result:
Timestamp('2022-05-30 09:02:35.520000')
For dataframes use:
import pandas as pd
df = pd.DataFrame({'First LogonTime':[44710.37680, 44757.00000]})
df['First LogonTime'] = pd.to_datetime(df['First LogonTime'], unit='d') - pd.DateOffset(years=70)
Or:
import pandas as pd
df = pd.DataFrame({'First LogonTime':[44710.37680, 44757.00000]})
df['First LogonTime'] = df['First LogonTime'].apply(lambda x: pd.to_datetime(x, unit='d') - pd.DateOffset(years=70))
Result:
First LogonTime
0 2022-05-30 09:02:35.520
1 2022-07-16 00:00:00.000