I was working with a dataframe that looked somewhat as below:
current | sold |
---|---|
5 | 6 |
10 | 8 |
15 | 10 |
20 | 12 |
My aim is to fins the available and remaining values. But my problem is that the values for the available and remaining column depends on the previous row value as below:
remaining = max((available current - sold),0)
available = remaining_prev current - sold
The final table is expected to look as follows:
current | sold | available | remaining |
---|---|---|---|
5 | 6 | 0 | 0 |
10 | 8 | 2 | 4 |
15 | 10 | 9 | 14 |
20 | 12 | 22 | 30 |
Is there any way to achieve the same. I've been trying this for a couple of days. Method that doesn't require looping will be much appreciated.
CodePudding user response:
import pandas as pd
import numpy as np
df['remaining'] = np.maximum(df['current'] - df['sold'], 0)
df['available'] = (df['remaining'] * 2).cumsum()
df['remaining'] = df['available'] - df['remaining']
df
current sold remaining available
0 5 6 0 0
1 10 8 2 4
2 15 10 9 14
3 20 12 22 30
CodePudding user response:
You calculation don't really make sense to me in terms of the meaning of "available" and "remaining" (is it a stock?). Your formula is equivalent to counting 2 times what I would consider a real "remaining/available" stock.
Nevertheless, mathematically, you could convert:
remaining = max((available current - sold),0)
available = remaining_prev current - sold
to:
remaining = remaining_prev 2*max(current - sold, 0)
This gives:
s = df['current'].sub(df['sold']).clip(0)
df['remaining'] = s.mul(2).cumsum()
df['available'] = df['remaining'].shift(fill_value=0) s
output:
current sold remaining available
0 5 6 0 0
1 10 8 4 2
2 15 10 14 9
3 20 12 30 22