Home > Back-end >  Python : Subtract columns value in ascending order value of a column
Python : Subtract columns value in ascending order value of a column

Time:06-14

Have a dataframe mortgage_data with columns name mortgage_amount and month (in asceding order)

mortgage_amount_paid = 1000

mortgage_data:

name   mortgage_amount  month 
mark     400              1
mark     500              2
mark     200              3

How to deduct and update mortgage_amount in ascending order or month using mortgage_amount_paid row by row in a dataframe and add a column paid_status as yes if mortgage_amount_paid is fully deducted for that amount and no if not like this

if mortgage_amount_paid = 1000 mortgage_data:

name   mortgage_amount  month  mortgage_amount_updated  paid_status 
mark     400              1         0                     full
mark     500              2         0                     full
mark     200              3       100                     partial

ex:

if mortgage_amount_paid = 600

mortgage_data:

name   mortgage_amount  month  mortgage_amount_updated  paid_status 
mark     400              1         0                     full
mark     500              2       300                     partial
mark     200              3       200                     zero

tried this:

import numpy as np
mortgage_amount_paid = 1000
df['mortgage_amount_updated'] = np.where(mortgage_amount_paid - df['mortgage_amount'].cumsum() >=0 , 0, df['mortgage_amount'].cumsum() - mortgage_amount_paid)
df['paid_status'] = np.where(df['mortgage_amount_updated'],'full','partial')

CodePudding user response:

IIUC, you can use masks:

mortgage_amount_paid = 600

# amount saved - debt
m1 = df['mortgage_amount'].cumsum().sub(mortgage_amount_paid)
# is it positive?
m2 = m1>0
# is the previous month also positive?
m3 = m2.shift(fill_value=False)

df['mortgage_amount_updated'] = (m1.clip(0, mortgage_amount_paid)
                                   .mask(m3, df['mortgage_amount'])
                                 )
df['paid_status'] = np.select([m3, m2], ['zero', 'partial'], 'full')

output:

   name  mortgage_amount  month  mortgage_amount_updated paid_status
0  mark              400      1                        0        full
1  mark              500      2                      300     partial
2  mark              200      3                      200        zero

CodePudding user response:

Idea is the cumsum before partial should less than mortgage_amount_paid and there could be at most one partial

mortgage_amount_paid = 600

m = df['mortgage_amount'].cumsum()

df['paid_status'] = np.select(
    [m <= mortgage_amount_paid,
     (m > mortgage_amount_paid) & (m.shift() < mortgage_amount_paid)
     ],
    ['full', 'partial'],
    default='zero'
)
df['mortgage_amount_updated'] = np.select(
    [df['paid_status'].eq('full'),
     df['paid_status'].eq('partial')],
    [0, m-mortgage_amount_paid],
    default=df['mortgage_amount']
)
print(df)

   name  mortgage_amount  month paid_status  mortgage_amount_updated
0  mark              400      1        full                        0
1  mark              500      2     partial                      300
2  mark              200      3        zero                      200
  • Related