Home > Software engineering >  Pandas dataframe row operation within a group
Pandas dataframe row operation within a group

Time:10-22

I have a dataframe with information about a stock that looks like this:

Product ID Initial stock Quantity
a1 5 -2
a1 5 -1
a1 5 7
a2 10 2
a2 10 -6
b1 14 -9
b1 14 11

Each row represents a purchase/sale of a certain product. I want to add a new column to know the current stock of the product. The first row of every product should be: Current stock = Inital stock Quantity and the next ones: Current stock[i] = Current stock[i-1] Quantity[i]

Product ID Initial stock Quantity Current stock
a1 5 -2 3
a1 5 -1 2
a1 5 7 9
a2 10 2 12
a2 10 -6 6
b1 14 -9 5
b1 14 11 16

In Excel is quite easy, but how would you do it using Pandas? I've tried grouping by Product ID, but I don't know how to introduce the condition of "the first row of every product"

CodePudding user response:

Compute a groupby.cumsum of the quantity and subtract it from the Initial stock :

df['Current stock'] = (df['Initial stock']
                         .add(df.groupby('Product ID')['Quantity'].cumsum())
                       )

output:

  Product ID  Initial stock  Quantity  Current stock
0         a1              5        -2              3
1         a1              5        -1              2
2         a1              5         7              9
3         a2             10         2             12
4         a2             10        -6              6
5         b1             14        -9              5
6         b1             14        11             16
  • Related