Create a new column in pandas df with the result of .diff() function based on a condition in another

Time:11-10

I need to apply the .diff() function to a column, but only where the row's value in another column is equal to the previous row's value in that same column.

Example:

import pandas as pd
import numpy as np

df=pd.DataFrame({'A':['Shrimp', 'Shrimp', 'Shrimp','Octopus','Octopus','Fish','Fish'],
                 'B':[10,11,15,25,30,5,15]})

# My attempt (does not work: the lambda is assigned to the column but never called)
df['C'] = (lambda x: x['B'].diff() if x['A'] == x['A'].shift(1) else 0)

Basically, what I am looking for is the price change for each product relative to its previous purchase. I have already sorted the DataFrame by product and date.

Right now I apply .diff() to the whole DataFrame, but when the product changes the function is applied anyway. So I need a condition: if the previous row is a different product, the function must not be applied.

CodePudding user response:

Is this what you want?

>>> df['C'] = df.groupby('A')['B'].diff().fillna(0)
>>> df

         A   B     C
0   Shrimp  10   0.0
1   Shrimp  11   1.0
2   Shrimp  15   4.0
3  Octopus  25   0.0
4  Octopus  30   5.0
5     Fish   5   0.0
6     Fish  15  10.0
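
The condition described in the question (only take the difference when the previous row has the same product) can also be written literally with a boolean mask. The following is a minimal sketch rather than a definitive solution. It assumes the rows are already sorted so that each product's rows are contiguous, as the question states. The column names C_mask and C_group are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['Shrimp', 'Shrimp', 'Shrimp', 'Octopus', 'Octopus', 'Fish', 'Fish'],
    'B': [10, 11, 15, 25, 30, 5, 15],
})

# True where the product in this row matches the product in the previous row.
same_product = df['A'].eq(df['A'].shift())

# Keep the row-to-row difference only where the product is unchanged; use 0 elsewhere.
df['C_mask'] = df['B'].diff().where(same_product, 0)

# The groupby version from the answer, shown side by side for comparison.
df['C_group'] = df.groupby('A')['B'].diff().fillna(0)

print(df)
```

On this sample data both columns hold the same values. The groupby version does not depend on the rows of one product being adjacent, while the mask version only compares each row with the one directly above it, so the two can differ on unsorted data.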