Pandas dataframe row operation with a condition

I have a dataframe with information about a stock that looks like this:

Product ID	Initial stock	Initial unit cost	Reference	Quantity	Unit cost	Current stock
a	5	22	Purch.	4	24	9
a	5	22	Purch.	8	21	17
a	5	22	Sale	-4	25	13
a	5	22	Purch.	10	20	23
a	5	22	Sale	-15	22	8
b	14	3.5	Sale	10	4	4
b	14	3.5	Purch.	20	3	24
b	14	3.5	Sale	5	4	19
b	14	3.5	Purch.	2	3.5	21
c	27	1	Purch.	100	0.95	127
c	27	1	Purch.	3	1.1	130

Each row represents a purchase/sale of a certain product. Quantity represents the number of units purchased/sold at a given Unit cost. Current stock is the remaining stock after the purchase/sale. For every product, I want to calculate the Weighted Average Cost (WAC) after each sale/purchase. The procedure is the following:

For the first row of every product, WAC = (Initial stock * Initial unit cost + Quantity * Unit cost) / Current stock, but only if Reference == 'Purch.'. Otherwise, WAC = Initial unit cost.
For the following rows, WAC[i] = (Current stock[i-1] * WAC[i-1] + Quantity[i] * Unit cost[i]) / Current stock[i], but only if Reference[i] == 'Purch.'. Otherwise, WAC[i] = WAC[i-1].
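
To make the two rules concrete, here is a minimal plain-Python sketch of the recurrence for a single product, using product a from the table above. The function name `running_wac` and the tuple layout are only illustrative assumptions. Seeding the loop with the initial stock and initial unit cost makes the first-row rule a special case of the general one.

```python
def running_wac(initial_stock, initial_cost, moves):
    """Return the WAC after each move for a single product.

    moves is a list of (reference, quantity, unit_cost, current_stock) tuples.
    """
    wacs = []
    # Seed with the initial values so the first row needs no special case.
    prev_stock, prev_wac = initial_stock, initial_cost
    for reference, quantity, unit_cost, current_stock in moves:
        if reference == 'Purch.':
            # A purchase blends the old inventory value with the new units.
            prev_wac = (prev_stock * prev_wac + quantity * unit_cost) / current_stock
        # A sale leaves the WAC unchanged, so prev_wac is reused as is.
        wacs.append(prev_wac)
        prev_stock = current_stock
    return wacs

# Product a from the table above.
moves_a = [
    ('Purch.', 4, 24, 9),
    ('Purch.', 8, 21, 17),
    ('Sale', -4, 25, 13),
    ('Purch.', 10, 20, 23),
    ('Sale', -15, 22, 8),
]
wac_a = [round(w, 2) for w in running_wac(5, 22, moves_a)]
print(wac_a)
```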

The next table shows what I'm looking for (WAC column and how to calculate it):

Product ID	Initial stock	Initial unit cost	Reference	Quantity	Unit cost	Current stock	(how to) WAC	WAC
a	5	22	Purch.	4	24	9	(5*22 + 4*24)/9	22.89
a	5	22	Purch.	8	21	17	(9*22.89 + 8*21)/17	22
a	5	22	Sale	-4	25	13	-	22
a	5	22	Purch.	10	20	23	(13*22 + 10*20)/23	21.13
a	5	22	Sale	-15	22	8	-	21.13
b	14	3.5	Sale	10	4	4	-	3.5
b	14	3.5	Purch.	20	3	24	(4*3.5 + 20*3)/24	3.08
b	14	3.5	Sale	5	4	19	-	3.08
b	14	3.5	Purch.	2	3.5	21	(19*3.08 + 2*3.5)/21	3.12
c	27	1	Purch.	100	0.95	127	(27*1 + 100*0.95)/127	0.96
c	27	1	Purch.	3	1.1	130	(127*0.96 + 3*1.1)/130	0.96

How would you do it using Pandas? I've tried to use a groupby and a cumsum, but I don't know how to introduce the "if" statement. After that, I want to summarize the information and just get the Product ID along with the final Stock and WAC, just like this:

Product ID	Current stock	WAC
a	8	21.13
b	21	3.12
c	130	0.96

Thank you in advance!

CodePudding user response：

You can write a function and call it with apply on the grouped dataframe.

I would try something like this:

import pandas as pd

def calc_wac(df_):
  df_ = df_.copy()
  cs_wac = 0  # running inventory value, i.e. Current stock * WAC
  prev_stock = None  # Current stock of the previous row in this group
  for counter, (idx, row) in enumerate(df_.iterrows()):
    if counter == 0:
      if row['Reference'] == 'Purch.':
        cs_wac = row['Initial stock'] * row['Initial unit cost'] + row['Quantity'] * row['Unit cost']
      else:
        cs_wac = row['Current stock'] * row['Initial unit cost']
    elif row['Reference'] == 'Purch.':
      # a purchase adds the value of the new units
      cs_wac += row['Quantity'] * row['Unit cost']
    else:
      # a sale keeps the WAC, so the value scales with the remaining stock
      cs_wac *= row['Current stock'] / prev_stock
    prev_stock = row['Current stock']
    df_.loc[idx, 'WAC'] = cs_wac / row['Current stock']
  return pd.Series({'Current stock': row['Current stock'], 'WAC': cs_wac / row['Current stock']})

This should return the summarized information when you call:

df.groupby('Product ID').apply(calc_wac)

If you want the full dataframe, just change the function's return statement to return df_.
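
As a self-contained sketch of the full-dataframe variant, the snippet below rebuilds the sample data from the question and adds a WAC column per product. The helper name `add_wac` and the variable `full` are illustrative assumptions. It loops over the groups and concatenates the results rather than using apply, only to avoid depending on how a given pandas version treats the grouping column inside apply.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'Product ID': list('aaaaabbbbcc'),
    'Initial stock': [5] * 5 + [14] * 4 + [27] * 2,
    'Initial unit cost': [22] * 5 + [3.5] * 4 + [1] * 2,
    'Reference': ['Purch.', 'Purch.', 'Sale', 'Purch.', 'Sale',
                  'Sale', 'Purch.', 'Sale', 'Purch.', 'Purch.', 'Purch.'],
    'Quantity': [4, 8, -4, 10, -15, 10, 20, 5, 2, 100, 3],
    'Unit cost': [24, 21, 25, 20, 22, 4, 3, 4, 3.5, 0.95, 1.1],
    'Current stock': [9, 17, 13, 23, 8, 4, 24, 19, 21, 127, 130],
})

def add_wac(group):
    """Return a copy of one product's rows with a running WAC column."""
    group = group.copy()
    first = group.iloc[0]
    prev_stock, prev_wac = first['Initial stock'], first['Initial unit cost']
    wacs = []
    for _, row in group.iterrows():
        if row['Reference'] == 'Purch.':
            # Blend the previous inventory value with the purchased units.
            prev_wac = (prev_stock * prev_wac
                        + row['Quantity'] * row['Unit cost']) / row['Current stock']
        wacs.append(float(prev_wac))  # sales simply carry the WAC forward
        prev_stock = row['Current stock']
    group['WAC'] = wacs
    return group

# Process each product separately, then stitch the pieces back together.
full = pd.concat([add_wac(g) for _, g in df.groupby('Product ID', sort=False)])
print(full[['Product ID', 'Reference', 'Current stock', 'WAC']].round(2))
```

The last row of each product in `full` then holds the final stock and WAC that the question asks for.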

CodePudding user response：

I hope I understood your question correctly.

Code:

# Create the '(how to) WAC' column with a lambda: purchases get a weighted cost, sales get None.
# Note: this uses Initial stock and Initial unit cost on every row, so it matches the
# WAC[i-1] recurrence from the question only when the purchase is the first row of its product.
df['(how to) WAC'] = df.apply(lambda row: (row['Initial stock']*row['Initial unit cost'] + row['Quantity']*row['Unit cost'])/row['Current stock'] if row['Reference']=='Purch.' else None, axis=1)

# Create the WAC column from the '(how to) WAC' column.
# A None takes the value above it within the same product; if there is no value above
# (first row of the product), the Initial unit cost is used instead.
df['WAC'] = df.groupby(['Product ID'])['(how to) WAC'].ffill().fillna(df['Initial unit cost'])

# Group by Product ID and display the last row of each group
df.groupby('Product ID').tail(1)[['Product ID','Current stock', 'WAC']]
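
To show the carry-forward step in isolation, here is a small sketch on a toy frame. The column name `purchase_wac` is an assumption made for this example, and its values on purchase rows are copied from the expected table in the question, with sale rows deliberately left empty. It illustrates only how ffill within each product, followed by fillna with the initial unit cost, fills the sale rows; it does not compute the purchase values themselves.

```python
import pandas as pd

# Toy frame: purchase rows carry the WAC from the question's expected table,
# sale rows are left as None on purpose.
toy = pd.DataFrame({
    'Product ID': ['a', 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'b'],
    'Initial unit cost': [22, 22, 22, 22, 22, 3.5, 3.5, 3.5, 3.5],
    'Current stock': [9, 17, 13, 23, 8, 4, 24, 19, 21],
    'purchase_wac': [22.89, 22.0, None, 21.13, None, None, 3.08, None, 3.12],
})

# Forward-fill inside each product, then fall back to the initial unit cost
# for rows that have no earlier purchase (product b starts with a sale).
toy['WAC'] = (
    toy.groupby('Product ID')['purchase_wac'].ffill()
       .fillna(toy['Initial unit cost'])
)

# Keep only the last row of each product for the summary.
last_rows = toy.groupby('Product ID').tail(1)[['Product ID', 'Current stock', 'WAC']]
print(last_rows)
```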