I have a pandas DataFrame, shown below:
masterid price
1001 12
1001 12
1001 11
1001 14
1002 15
1003 10
1003 10
1003 10
1004 16
1004 17
I want to count how many times the price changes for each masterid.
Example: for masterid 1001 the price changed from 12 to 11 and from 11 to 14 (2 times).
Expected output:
masterid price
1001 2
1002 0
1003 0
1004 1
I have tried using df['price'].ne(df['price'].shift())
but I am not sure how to use groupby in this case. Thanks in advance.
CodePudding user response:
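You are almost there. The next step is to group by masterid and use apply
with your existing comparison and sum. Skip the first row of each group:
shift() puts NaN there, so it would otherwise be counted as a change.
df.groupby('masterid')['price'].apply(
    lambda s: s.ne(s.shift()).iloc[1:].sum()
)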
This strategy should work pretty generally, though apply calls the lambda
once per group, so it can be slow when there are many groups.
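If apply does become a bottleneck, one apply-free variant is sketched below. It rebuilds the frame from the question, shifts the price within each masterid, and ignores each group's first row, which has no previous price. This is only a sketch: the names prev, changed and result are made up for illustration, and it assumes the price column has no missing values.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "masterid": [1001, 1001, 1001, 1001, 1002, 1003, 1003, 1003, 1004, 1004],
    "price": [12, 12, 11, 14, 15, 10, 10, 10, 16, 17],
})

# Previous price within the same masterid (NaN on each group's first row)
prev = df.groupby("masterid")["price"].shift()

# A change needs a previous row to exist and a price different from it
changed = prev.notna() & df["price"].ne(prev)

# Sum the boolean flags per masterid to get the number of changes
result = changed.groupby(df["masterid"]).sum().reset_index(name="price")
print(result)
```

Because everything here is a column-wise operation, no Python function is called per group.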
CodePudding user response:
# calculate the price difference within each masterid using diff
# take the absolute value and compare with zero (the NaN on each group's first row compares False)
# group by masterid again and sum the True flags to get the count
(
    df.assign(price=df.groupby('masterid')['price'].diff().abs() > 0)
    .groupby('masterid')['price']
    .sum()
    .reset_index()
)
masterid price
0 1001 2
1 1002 0
2 1003 0
3 1004 1
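The same diff idea can be wrapped in a small helper, which makes it easy to check what counts as a "change". The sketch below uses a hypothetical function name, count_price_changes, and a tiny made-up sample (not the data from the question) in which the price goes 12, 11, 12. That counts as two changes, because consecutive rows are compared rather than distinct values. It assumes the price column is numeric, since diff needs subtraction.

```python
import pandas as pd

def count_price_changes(frame, key="masterid", value="price"):
    """Hypothetical helper: count consecutive-row changes of `value` per `key`."""
    # diff() is NaN on the first row of each group; treat that as "no change"
    changed = frame.groupby(key)[value].diff().fillna(0).ne(0)
    # Sum the boolean flags per key to get the number of changes
    return changed.groupby(frame[key]).sum().reset_index(name=value)

# Made-up sample: the price for masterid 1 drops and then returns to 12
sample = pd.DataFrame({
    "masterid": [1, 1, 1, 2],
    "price": [12, 11, 12, 15],
})

counts = count_price_changes(sample)
print(counts)
```

If what you actually want is the number of distinct prices per masterid, that is a different question and this helper will not answer it.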