I have the dataframe with max and min date and the value('price') - the values can be grouped by third column (let's say 'category'). How can the difference between price between max and min date be calculated using Pandas for each of the categories? Do I need groupby for that or it can be achieved using index?
So that it will be 783.44 for category A and 5.08 for category B
Thanks!
CodePudding user response:
Use:
print (df)
date price category
0 2018-06-03 7632.09 a
1 2018-06-03 123.29 a
2 2018-04-05 6848.65 a
3 2018-04-05 118.21 a
df2 = (df.groupby(['category', 'date'])['price']
.agg(['max','min'])
.add_prefix('_')
.eval('_max - _min')
.reset_index(name='diff'))
print (df2)
category date diff
0 a 2018-04-05 6730.44
1 a 2018-06-03 7508.80