I have a Statement of accounts, where i have Unique ID, Disbursed date, payment date and the balance amount.
Date range for below data = Disbursed date to May-2022
Example of date:
Unique Disbursed date payment date balance amount
123 2022-Jan-13 2022-Jan-27 10,000
123 2022-Jan-13 2022-Feb-28 5,000
123 2022-Jan-13 2022-Apr-29 2,000
first I want to groupby payment date(last day of each month) and as an aggr function instead of Sum or mean, I want to carry forward the same balance reflecting in the last month last day.
As you can see March is missing in the records, here I want to add a new record for March with same balance given in Feb-22 i.e 5,000 and date for the new record should be last day of Mar-22.
Since date range given till 2022-May then here I want to add another new record for May-22 with same balance given in last month (Apr-22) i.e 2000 and date for the new record should be last day of May-22
Note : I have multiple unique ids like 123, 456, 789, etc.
I'd tried below code to find out the missing month
for i in df['date']: pd.date_range(i,'2020-11-28').difference(df.index) print(i)
but, it is giving days wise missing date. I want to find out the missing "month" instead of date for each unique id
CodePudding user response:
You can use:
# generate needed month ends
idx = pd.date_range('2022-01', '2022-06', freq='M')
out = (df
# compute the month end for existing data
.assign(month_end=pd.to_datetime(df['payment date'])
.sub(pd.Timedelta('1d'))
.add(pd.offsets.MonthEnd()))
.set_index(['Unique', 'month_end'])
# reindex with missing ID/month ends
.reindex(pd.MultiIndex.from_product([df['Unique'].unique(),
idx
], names=['Unique', 'idx']))
.reset_index()
# fill missing month end with correct format
.assign(**{'payment date': lambda d:
d['payment date'].fillna(d['idx'].dt.strftime('%Y-%b-%d'))})
# ffill the data per ID
.groupby('Unique').ffill()
)
output:
Unique idx Disbursed date payment date balance amount
0 123 2022-01-31 2022-Jan-13 2022-Jan-27 10,000
1 123 2022-02-28 2022-Jan-13 2022-Feb-28 5,000
2 123 2022-03-31 2022-Jan-13 2022-Mar-31 5,000
3 123 2022-04-30 2022-Jan-13 2022-Apr-29 2,000
4 123 2022-05-31 2022-Jan-13 2022-May-31 2,000