I have the following dataframe with some dates (datetime64[ns]):
            A6U20      A6Z20      A6H21      A6M21      A6U21      A6Z21
expiry 2020-09-14 2020-12-14 2021-03-15 2021-06-14 2021-09-13 2021-12-13
I want to create another dataframe with the number of days between each column and the preceding one. I tried the following:
df2 = (df - df.shift(axis=1))
which resulted in:
         A6U20   A6Z20   A6H21   A6M21   A6U21   A6Z21
expiry     NaT 91 days 91 days 91 days 91 days 91 days
which is fine, except that I don't need the "days" suffix at the end of each timedelta value. After reading some similar questions, I tried the following command:
df2 = (df - df.shift(axis=1)).dt.days
which resulted in the following error: 'DataFrame' object has no attribute 'dt'.
What am I doing wrong?
CodePudding user response:
The dt accessor is available only on objects of type Series, not on DataFrame.
If you just want the number of days instead of a pandas.Timedelta object, you can do this:
df2 = (df - df.shift(axis=1))
df2.loc['expiry',:] = df2.loc['expiry',:].dt.days
Output:
       A6U20 A6Z20 A6H21 A6M21 A6U21 A6Z21
expiry   NaT  91.0  91.0  91.0  91.0  91.0
If you would like the NaT to be replaced with 0 days, you can use fillna():
df2 = (df - df.shift(axis=1)).fillna(pd.Timedelta(0))
df2.loc['expiry',:] = df2.loc['expiry',:].dt.days
Output:
       A6U20 A6Z20 A6H21 A6M21 A6U21 A6Z21
expiry     0    91    91    91    91    91
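As an alternative sketch (a different technique from the row assignment above, not a replacement for it), dividing the whole frame by a one-day pd.Timedelta converts every cell to a plain number in one step, without touching dt at all. The frame is rebuilt here from the dates in the question, and the names expiries, diff, days_float and days_int are made up for illustration:

```python
import pandas as pd

# Rebuild the question's frame: one "expiry" row, six datetime columns.
expiries = {
    "A6U20": "2020-09-14",
    "A6Z20": "2020-12-14",
    "A6H21": "2021-03-15",
    "A6M21": "2021-06-14",
    "A6U21": "2021-09-13",
    "A6Z21": "2021-12-13",
}
df = pd.DataFrame(
    {name: [pd.Timestamp(date)] for name, date in expiries.items()},
    index=["expiry"],
)

# Column-to-column differences; the first column has no predecessor, so it is NaT.
diff = df - df.shift(axis=1)

# Timedelta / Timedelta yields floats; NaT becomes NaN.
days_float = diff / pd.Timedelta(days=1)

# Replace the missing first difference with zero before casting to integers.
days_int = (diff.fillna(pd.Timedelta(0)) / pd.Timedelta(days=1)).astype(int)

print(days_float)
print(days_int)
```

Note that the division keeps fractional days if the timestamps carry a time component, and astype(int) would then truncate them; with date-only values like these, every difference is a whole number of days.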
CodePudding user response:
I figured out that dt works only on Series, not on DataFrames. Calling apply on the frame of differences, so that dt runs on each column (a Series) in turn, did the trick:
df2 = df2.apply(lambda x: x.dt.days)
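For completeness, here is a self-contained sketch of that apply approach, assuming the frame looks like the one in the question. The dictionary expiries and the variable names are illustrative only:

```python
import pandas as pd

# Rebuild the question's frame: one "expiry" row, six datetime columns.
expiries = {
    "A6U20": "2020-09-14",
    "A6Z20": "2020-12-14",
    "A6H21": "2021-03-15",
    "A6M21": "2021-06-14",
    "A6U21": "2021-09-13",
    "A6Z21": "2021-12-13",
}
df = pd.DataFrame(
    {name: [pd.Timestamp(date)] for name, date in expiries.items()},
    index=["expiry"],
)

# Timedelta frame: each column minus the column to its left.
df2 = df - df.shift(axis=1)

# apply passes each column to the lambda as a Series, where .dt is available.
df2 = df2.apply(lambda col: col.dt.days)

print(df2)
```

The first column stays missing (NaN) because it has no preceding column; chain fillna(pd.Timedelta(0)) before the apply call if a zero is preferred there.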