I have df1
as
Acc_id Acc_name Acc_end_date
qqq-1 test1 12/31/2021
www-2 test2 05/28/2022
yyy-6 test3 06/15/2022
zzz-6 test4 06/17/2022
kkk-6 test5 03/16/2022
The Acc_end_date
is in mm/dd/yy format.
I am trying to get only those accounts where Acc_end_date
- todays date >30.
Meaning, 30 days should pass as of today for the Acc_id
to be selected.
I tried the below
# changing the column to datetime format
df1['Acc_end_date']= pd.to_datetime(df1['Acc_end_date'])
# getting todays date and creating an array with those number of Acc_id and substracting
todays_date = []
for i in range(df1.shape[0]):
temp = date.today()
todays_date.append(temp)
np.array(todays_date)
df2 = df1[df1['Acc_end_date'] - todays_date>30]
I am getting the error as
TypeError: unsupported operand type(s) for -: 'DatetimeArray' and 'list'
gives error as
TypeError: unsupported operand type(s) for -: 'DatetimeArray' and 'datetime.date'
Trying this
df2 = df1[only_inactive['Acc_end_date'] - date.today()>30]
I am unable to get the date right and in the right format for the substraction.
Please help with any approaches
CodePudding user response:
IIUC, you can use:
df2 = df1[pd.to_datetime(df1['Acc_end_date'], dayfirst=False)
.rsub(pd.Timestamp('today')).gt('30d')]
output:
Acc_id Acc_name Acc_end_date
0 qqq-1 test1 12/31/2021
4 kkk-6 test5 03/16/2022