I have two fields in a dataframe, both of which are datetime64[ns]
I thought I could just do this...
df_hist['MonthsBetween'] = (df_hist.Last_Date - df_hist.Begin_Time) / pd.Timedelta(months=1)
One field has only a data and one has a date/time, but both are of datatype datetime64[ns]. I Googled this and it seems like it should work, but I'm getting an error message saying:
TypeError: '<' not supported between instances of 'str' and 'int'
I thought these were both datetime64[ns], and neither str or int.
CodePudding user response:
You can use to_period to take months and then subtract
df_hist['MonthsBetween'] = df_hist['Last_Date'].dt.to_period('M').astype(int) - df_hist['Begin_Time'].dt.to_period('M').astype(int)
df
Out[123]:
Last_Date Begin_Time Months
0 2022-01-01 00:00:00 2022-03-01 2
1 2022-01-01 12:00:00 2022-03-02 2
2 2022-01-02 00:00:00 2022-03-03 2
3 2022-01-02 12:00:00 2022-03-04 2
4 2022-01-03 00:00:00 2022-03-05 2
5 2022-01-03 12:00:00 2022-03-06 2
6 2022-01-04 00:00:00 2022-03-07 2
7 2022-01-04 12:00:00 2022-03-08 2
dtypes of date columns should be datetime64[ns]
df_hist.dtypes
Out[125]:
Last_Date datetime64[ns]
Begin_Time datetime64[ns]
Months int64
dtype: object