Have a dataframe mortgage_data with columns name mortgage_amount and month (in asceding order)
mortgage_amount_paid = 1000
mortgage_data:
name mortgage_amount month
mark 400 1
mark 500 2
mark 200 3
How to deduct and update mortgage_amount
in ascending order or month using mortgage_amount_paid
row by row in a dataframe
and add a column paid_status
as yes if mortgage_amount_paid is fully deducted for that amount and no if not like this
if mortgage_amount_paid = 1000
mortgage_data:
name mortgage_amount month mortgage_amount_updated paid_status
mark 400 1 0 full
mark 500 2 0 full
mark 200 3 100 partial
ex:
if mortgage_amount_paid = 600
mortgage_data:
name mortgage_amount month mortgage_amount_updated paid_status
mark 400 1 0 full
mark 500 2 300 partial
mark 200 3 200 zero
tried this:
import numpy as np
mortgage_amount_paid = 1000
df['mortgage_amount_updated'] = np.where(mortgage_amount_paid - df['mortgage_amount'].cumsum() >=0 , 0, df['mortgage_amount'].cumsum() - mortgage_amount_paid)
df['paid_status'] = np.where(df['mortgage_amount_updated'],'full','partial')
CodePudding user response:
IIUC, you can use masks:
mortgage_amount_paid = 600
# amount saved - debt
m1 = df['mortgage_amount'].cumsum().sub(mortgage_amount_paid)
# is it positive?
m2 = m1>0
# is the previous month also positive?
m3 = m2.shift(fill_value=False)
df['mortgage_amount_updated'] = (m1.clip(0, mortgage_amount_paid)
.mask(m3, df['mortgage_amount'])
)
df['paid_status'] = np.select([m3, m2], ['zero', 'partial'], 'full')
output:
name mortgage_amount month mortgage_amount_updated paid_status
0 mark 400 1 0 full
1 mark 500 2 300 partial
2 mark 200 3 200 zero
CodePudding user response:
Idea is the cumsum
before partial
should less than mortgage_amount_paid
and there could be at most one partial
mortgage_amount_paid = 600
m = df['mortgage_amount'].cumsum()
df['paid_status'] = np.select(
[m <= mortgage_amount_paid,
(m > mortgage_amount_paid) & (m.shift() < mortgage_amount_paid)
],
['full', 'partial'],
default='zero'
)
df['mortgage_amount_updated'] = np.select(
[df['paid_status'].eq('full'),
df['paid_status'].eq('partial')],
[0, m-mortgage_amount_paid],
default=df['mortgage_amount']
)
print(df)
name mortgage_amount month paid_status mortgage_amount_updated
0 mark 400 1 full 0
1 mark 500 2 partial 300
2 mark 200 3 zero 200