I have a dataframe with information about a stock that looks like this:
Product ID | Initial stock | Initial unit cost | Reference | Quantity | Unit cost | Current stock |
---|---|---|---|---|---|---|
a | 5 | 22 | Purch. | 4 | 24 | 9 |
a | 5 | 22 | Purch. | 8 | 21 | 17 |
a | 5 | 22 | Sale | -4 | 25 | 13 |
a | 5 | 22 | Purch. | 10 | 20 | 23 |
a | 5 | 22 | Sale | -15 | 22 | 8 |
b | 14 | 3.5 | Sale | 10 | 4 | 4 |
b | 14 | 3.5 | Purch. | 20 | 3 | 24 |
b | 14 | 3.5 | Sale | 5 | 4 | 19 |
b | 14 | 3.5 | Purch. | 2 | 3.5 | 21 |
c | 27 | 1 | Purch. | 100 | 0.95 | 127 |
c | 27 | 1 | Purch. | 3 | 1.1 | 130 |
Each row represents a purchase/sale of a certain product. Quantity
represents the number of units purchased/sold at a given Unit cost
. Current stock
is the remaining stock after the purchase/sale. For every product, I want to calculate the Weighted Average Cost (WAC) after each sale/purchase. The procedure is the following:
For the first row of every product,
WAC = (Initial stock * Initial unit cost Quantity * Unit cost) / Current stock
just ifReference == 'Purch.'
. If not,WAC = Initial unit cost
.For the next rows,
WAC[i] = (Current stock[i-1] * WAC[i-1] Quantity[i] * Unit cost[i]) / Current stock[i]
just ifReference[i] == 'Purch.'
. If not,WAC[i] = WAC[i-1]
.
The next table shows what I'm looking for (WAC
column and how to calculate it):
Product ID | Initial stock | Initial unit cost | Reference | Quantity | Unit cost | Current stock | (how to) WAC | WAC |
---|---|---|---|---|---|---|---|---|
a | 5 | 22 | Purch. | 4 | 24 | 9 | (5*22 4*24)/9 | 22.89 |
a | 5 | 22 | Purch. | 8 | 21 | 17 | (9*22.89 8*21)/17 | 22 |
a | 5 | 22 | Sale | -4 | 25 | 13 | - | 22 |
a | 5 | 22 | Purch. | 10 | 20 | 23 | (13*22 10*20)/23 | 21.13 |
a | 5 | 22 | Sale | -15 | 22 | 8 | - | 21.13 |
b | 14 | 3.5 | Sale | 10 | 4 | 4 | - | 3.5 |
b | 14 | 3.5 | Purch. | 20 | 3 | 24 | (4*3.5 20*3)/24 | 3.08 |
b | 14 | 3.5 | Sale | 5 | 4 | 19 | - | 3.08 |
b | 14 | 3.5 | Purch. | 2 | 3.5 | 21 | (19*3.08 2*3.5)/21 | 3.12 |
c | 27 | 1 | Purch. | 100 | 0.95 | 127 | (27*1 100*0.95)/127 | 0.96 |
c | 27 | 1 | Purch. | 3 | 1.1 | 130 | (127*0.96 3*1.1)/130 | 0.96 |
How would you do it using Pandas? I've tried to use a groupby and a cumsum, but I don't know how to introduce the "if" statement. After that, I want to summarize the information and just get the Product ID
along with the final Stock
and WAC
, just like this:
Product ID | Current stock | WAC |
---|---|---|
a | 8 | 21.13 |
b | 21 | 3.12 |
c | 130 | 0.96 |
Thank you in advance!
CodePudding user response:
You may create a function and call it using apply
in a groupby
dataframe.
I would try something like this
def calc_wac(df_):
df_ = df_.copy()
cs_wac = 0
for counter, row in enumerate(df_.iterrows()):
idx,row = row
if counter==0:
if row['Reference'] == 'Purch.':
cs_wac = row['Initial stock'] * row['Initial unit cost'] row['Quantity'] * row['Unit cost']
else:
cs_wac = row['Current stock'] * row['Initial unit cost']
elif row['Reference'] == 'Purch.':
cs_wac = row['Quantity'] * row['Unit cost']
else:
cs_wac *= row['Current stock']/df.loc[idx-1,'Current stock']
df_.loc[idx, 'WAC'] = cs_wac/row['Current stock']
return pd.Series({'Current stock': row['Current stock'], 'WAC':cs_wac/row['Current stock']})
This should return the summarized information when you call:
df.groupby('Product ID').apply(calc_wac)
If you want the full dataframe just change the function return to the entire dataframe return df_
CodePudding user response:
Hope I understood your question correct.
Code:
#Create new columns using lambda function
df['(how to)WAC']= df.apply(lambda row: (row['Intial stock']*row['Intial unit cost'] row['Quantity']*row['Unit cost'])/row['Current stock'] if row['Reference']=='Purch' else None, axis=1)
#Creating another column WAC, here it will gonna take data from '(how to)WAC' column.
#More, if its None will will take the above value. and if its the first value then it will take from Initial
df['WAC'] = df.groupby(['Product ID'])['(how to) WAC'].ffill().fillna(df['Initial unit cost'])
#Group by the ID and display the last rows of each
df.groupby('Product ID').tail(1)[['Product ID','Current stock', 'WAC']]#