Getting 1970-01-01 after converting int to datetime

Time:05-08

I have a column like this:

df.CalculationDateKey.head()
0    20201231
1    20201130
2    20201031
3    20200930
4    20200831
Name: CalculationDateKey, dtype: int64

And I want to convert it into datetime.

I tried:

pd.to_datetime(df['CalculationDateKey']).head()

which yields:

0   1970-01-01 00:00:00.020201231
1   1970-01-01 00:00:00.020201130
2   1970-01-01 00:00:00.020201031
3   1970-01-01 00:00:00.020200930
4   1970-01-01 00:00:00.020200831
Name: CalculationDateKey, dtype: datetime64[ns]

I want this so that I can calculate the difference in months between two dates.

CodePudding user response:

Don't let pandas infer the date format; specify it explicitly:

>>> pd.to_datetime(df['CalculationDateKey'], format='%Y%m%d')
0   2020-12-31
1   2020-11-30
2   2020-10-31
3   2020-09-30
4   2020-08-31
Name: CalculationDateKey, dtype: datetime64[ns]

CodePudding user response:

When you pass integers to to_datetime without a format or unit, pandas treats them as nanoseconds since the Unix epoch (January 1, 1970). The value 20201231 therefore lands about 0.02 seconds after the epoch, which is why you're getting results like 1970-01-01 00:00:00.020201231.
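To see that default interpretation directly, here is a minimal sketch (assuming a reasonably recent pandas; the variable names are only illustrative) that converts one of the integers and compares it with an explicit nanosecond offset from the epoch:

```python
import pandas as pd

key = 20201231  # one of the values from the question

# With no format or unit, the integer is read as nanoseconds since the Unix epoch.
default_result = pd.to_datetime(key)

# The same instant built by hand: epoch + 20,201,231 nanoseconds.
explicit = pd.Timestamp("1970-01-01") + pd.Timedelta(key, unit="ns")

print(default_result)              # 1970-01-01 00:00:00.020201231
print(default_result == explicit)  # True
```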

As Corralien mentioned in their answer, you can pass a format string to tell pandas how to interpret those YYYYMMDD values:

pd.to_datetime(df['CalculationDateKey'], format='%Y%m%d')
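
Since the goal in the question is a difference in months, here is one possible sketch of that follow-up step. The reference date is a made-up value for illustration, and "difference in months" is assumed to mean calendar months, ignoring the day of the month:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {"CalculationDateKey": [20201231, 20201130, 20201031, 20200930, 20200831]}
)
dates = pd.to_datetime(df["CalculationDateKey"], format="%Y%m%d")

# Hypothetical reference date, not part of the original question.
reference = pd.Timestamp("2021-03-31")

# Calendar-month difference: 12 per year plus the month offset (days are ignored).
months_between = (reference.year - dates.dt.year) * 12 + (reference.month - dates.dt.month)

print(months_between.tolist())  # [3, 4, 5, 6, 7]
```

If partial months should count differently (for example, rounding by day of month), the formula would need adjusting.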