I have the following table:
ID | Qty_1 | Qty_2 |
---|---|---|
A | 1 | 10 |
A | 2 | 0 |
A | 3 | 0 |
B | 3 | 29 |
B | 2 | 0 |
B | 1 | 0 |
I want to iterate based on the ID, and subtract Qty_2 - Qty_1 and update the next row with that result.
The result would be:
ID | Qty_1 | Qty_2 |
---|---|---|
A | 1 | 10 |
A | 2 | 8 |
A | 3 | 5 |
B | 3 | 29 |
B | 2 | 27 |
B | 1 | 26 |
Ideally, I would also like to start by subtracting the first row end a new ID appears and only after that start the loop:
ID | Qty_1 | Qty_2 |
---|---|---|
A | 1 | 9 |
A | 2 | 7 |
A | 3 | 4 |
B | 3 | 26 |
B | 2 | 24 |
B | 1 | 23 |
Each of the solutions is ok! Thank you!
CodePudding user response:
First compute the difference between 'Qty_1' and 'Qty_2' row by row, then group by 'ID' and compute cumulative sum:
df['Qty_2'] = df.assign(Qty_2=df['Qty_2'].sub(df['Qty_1'])) \
.groupby('ID')['Qty_2'].cumsum()
print(df)
# Output:
ID Qty_1 Qty_2
0 A 1 9
1 A 2 7
2 A 3 4
3 B 3 26
4 B 2 24
5 B 1 23
Setup:
data = {'ID': ['A', 'A', 'A', 'B', 'B', 'B'],
'Qty_1': [1, 2, 3, 3, 2, 1],
'Qty_2': [10, 0, 0, 29, 0, 0]}
df = pd.DataFrame(data)