Home > Software engineering >  Pandas dataframe row operation with a condition
Pandas dataframe row operation with a condition

Time:10-22

I have a dataframe with information about a stock that looks like this:

Product ID Initial stock Initial unit cost Reference Quantity Unit cost Current stock
a 5 22 Purch. 4 24 9
a 5 22 Purch. 8 21 17
a 5 22 Sale -4 25 13
a 5 22 Purch. 10 20 23
a 5 22 Sale -15 22 8
b 14 3.5 Sale 10 4 4
b 14 3.5 Purch. 20 3 24
b 14 3.5 Sale 5 4 19
b 14 3.5 Purch. 2 3.5 21
c 27 1 Purch. 100 0.95 127
c 27 1 Purch. 3 1.1 130

Each row represents a purchase/sale of a certain product. Quantity represents the number of units purchased/sold at a given Unit cost. Current stock is the remaining stock after the purchase/sale. For every product, I want to calculate the Weighted Average Cost (WAC) after each sale/purchase. The procedure is the following:

  • For the first row of every product, WAC = (Initial stock * Initial unit cost Quantity * Unit cost) / Current stock just if Reference == 'Purch.'. If not, WAC = Initial unit cost.

  • For the next rows, WAC[i] = (Current stock[i-1] * WAC[i-1] Quantity[i] * Unit cost[i]) / Current stock[i] just if Reference[i] == 'Purch.'. If not, WAC[i] = WAC[i-1].

The next table shows what I'm looking for (WAC column and how to calculate it):

Product ID Initial stock Initial unit cost Reference Quantity Unit cost Current stock (how to) WAC WAC
a 5 22 Purch. 4 24 9 (5*22 4*24)/9 22.89
a 5 22 Purch. 8 21 17 (9*22.89 8*21)/17 22
a 5 22 Sale -4 25 13 - 22
a 5 22 Purch. 10 20 23 (13*22 10*20)/23 21.13
a 5 22 Sale -15 22 8 - 21.13
b 14 3.5 Sale 10 4 4 - 3.5
b 14 3.5 Purch. 20 3 24 (4*3.5 20*3)/24 3.08
b 14 3.5 Sale 5 4 19 - 3.08
b 14 3.5 Purch. 2 3.5 21 (19*3.08 2*3.5)/21 3.12
c 27 1 Purch. 100 0.95 127 (27*1 100*0.95)/127 0.96
c 27 1 Purch. 3 1.1 130 (127*0.96 3*1.1)/130 0.96

How would you do it using Pandas? I've tried to use a groupby and a cumsum, but I don't know how to introduce the "if" statement. After that, I want to summarize the information and just get the Product ID along with the final Stock and WAC, just like this:

Product ID Current stock WAC
a 8 21.13
b 21 3.12
c 130 0.96

Thank you in advance!

CodePudding user response:

You may create a function and call it using apply in a groupby dataframe.

I would try something like this

def calc_wac(df_):
  df_ = df_.copy()
  cs_wac = 0
  for counter, row in enumerate(df_.iterrows()):
    idx,row = row
    if counter==0:
      if row['Reference'] == 'Purch.':
        cs_wac  = row['Initial stock'] * row['Initial unit cost']   row['Quantity'] * row['Unit cost']
      else:
        cs_wac  = row['Current stock'] * row['Initial unit cost']
    elif row['Reference'] == 'Purch.':
      cs_wac  = row['Quantity'] * row['Unit cost']
    else:
      cs_wac *= row['Current stock']/df.loc[idx-1,'Current stock']
    df_.loc[idx, 'WAC'] = cs_wac/row['Current stock']
  return pd.Series({'Current stock': row['Current stock'], 'WAC':cs_wac/row['Current stock']})

This should return the summarized information when you call:

df.groupby('Product ID').apply(calc_wac)

If you want the full dataframe just change the function return to the entire dataframe return df_

CodePudding user response:

Hope I understood your question correct.

Code:

#Create new columns using lambda function
df['(how to)WAC']= df.apply(lambda row: (row['Intial stock']*row['Intial unit cost'] row['Quantity']*row['Unit cost'])/row['Current stock'] if row['Reference']=='Purch' else None, axis=1)



#Creating another column WAC, here it will gonna take data from '(how to)WAC' column.
#More, if its None will will take the above value. and if its the first value then it will take from Initial 

df['WAC']  = df.groupby(['Product ID'])['(how to) WAC'].ffill().fillna(df['Initial unit cost'])




#Group by the ID and display the last rows of each
df.groupby('Product ID').tail(1)[['Product ID','Current stock', 'WAC']]#
  • Related