Given the pandas dataframe with three loans, I need to add to the dataframe the payments, where the payment amount is total loan amount / number of payments. If seq = 0, is the loan amount, else seq is the payment number. I can do this with iterrows()
however the dataframe is very large and I would like to find a different solution.
This is my attempt:
import pandas as pd
rows = [{'id': 'x1', 'seq': 0, 'amount': 2000, 'payments': 4 },
{'id': 'x2', 'seq': 0, 'amount': 4000, 'payments': 2 },
{'id': 'x3', 'seq': 0, 'amount': 9000, 'payments': 3 }]
df = pd.DataFrame(rows)
df2 = df.copy()
for index, row in df.iterrows():
num = range(row['payments'])
for i in num:
payment_amount = row['amount'] / row['payments']
row2 = {'id': row['id'], 'seq': i 1 , 'amount': payment_amount, 'payments': 0 }
df2 = df2.append(row2, ignore_index=True)
The result should be:
id seq amount payments
0 x1 0 2000.0 4
1 x2 0 4000.0 2
2 x3 0 9000.0 3
3 x1 1 500.0 0
4 x1 2 500.0 0
5 x1 3 500.0 0
6 x1 4 500.0 0
7 x2 1 2000.0 0
8 x2 2 2000.0 0
9 x3 1 3000.0 0
10 x3 2 3000.0 0
11 x3 3 3000.0 0
However without using iterrows()
. Is this possible?
CodePudding user response:
Update:
df_pay = df.iloc[df.index.repeat(df['payments'])]\
.eval('amount = amount / payments')\
.assign(payments=0)
df_pay['seq'] = df_pay.groupby('id').cumcount() 1
pd.concat([df, df_pay], ignore_index=True)
Output:
id seq amount payments
0 x1 0 2000.0 4
1 x2 0 4000.0 2
2 x3 0 9000.0 3
3 x1 1 500.0 0
4 x1 2 500.0 0
5 x1 3 500.0 0
6 x1 4 500.0 0
7 x2 1 2000.0 0
8 x2 2 2000.0 0
9 x3 1 3000.0 0
10 x3 2 3000.0 0
11 x3 3 3000.0 0
Try this:
pd.concat([df,
df.iloc[df.index.repeat(df['payments'])]\
.eval('amount = amount / payments')\
.assign(payments=0)])
Output:
id seq amount payments
0 x1 0 2000.0 4
1 x2 0 4000.0 2
2 x3 0 9000.0 3
0 x1 0 500.0 0
0 x1 0 500.0 0
0 x1 0 500.0 0
0 x1 0 500.0 0
1 x2 0 2000.0 0
1 x2 0 2000.0 0
2 x3 0 3000.0 0
2 x3 0 3000.0 0
2 x3 0 3000.0 0
Trick using pd.Index.repeat
to generate payment records.