I need to apply .diff() to a column, but only where the value in another column is equal to the value in the previous row of that same column.
Example:
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': ['Shrimp', 'Shrimp', 'Shrimp', 'Octopus', 'Octopus', 'Fish', 'Fish'],
                   'B': [10, 11, 15, 25, 30, 5, 15]})
df['C'] = (lambda x: x['B'].diff() if x['A'] == x['A'].shift(1) else 0)
Basically, I want the price difference for each product relative to its previous purchase. The DataFrame is already sorted by product and date.
Right now I apply .diff() to the whole DataFrame, but it is also applied on the row where the product changes. I need a condition so that, when the previous row is a different product, the function is not applied.
CodePudding user response:
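Is this what you want? Grouping by A makes .diff() restart at each product, and .fillna(0) replaces the NaN on each product's first row with 0.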
>>> df['C'] = df.groupby('A')['B'].diff().fillna(0)
>>> df
         A   B     C
0   Shrimp  10   0.0
1   Shrimp  11   1.0
2   Shrimp  15   4.0
3  Octopus  25   0.0
4  Octopus  30   5.0
5     Fish   5   0.0
6     Fish  15  10.0
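If you would rather write the condition from the question literally ("only diff when the previous row has the same product"), a mask built with .shift() should give the same result on this data. This is only a sketch: the column names C_mask and C_group are made up for the comparison, and it assumes the rows for each product are contiguous, as they are after sorting by product and date.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['Shrimp', 'Shrimp', 'Shrimp', 'Octopus', 'Octopus', 'Fish', 'Fish'],
    'B': [10, 11, 15, 25, 30, 5, 15],
})

# True where the product equals the product in the previous row
# (the first row compares against NaN, so it is False).
same_as_prev = df['A'].eq(df['A'].shift())

# Plain row-to-row difference, kept only where the product did not change;
# everywhere else the value is replaced with 0.
# 'C_mask' is a hypothetical column name used for illustration.
df['C_mask'] = df['B'].diff().where(same_as_prev, 0)

# The grouped version from the answer, stored under a hypothetical name
# so the two results can be compared side by side.
df['C_group'] = df.groupby('A')['B'].diff().fillna(0)

print(df)
```

The two approaches differ when the data is not sorted: the mask only looks at the adjacent row, while groupby compares each row with the previous row of the same product wherever it sits in the frame.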