I have a dataframe with seasons as a column. Each season has two variety items. I want to compute each season-wise performance of all items with one item A
, where item A
performance will be marked as 100%.
code:
xdf = pd.DataFrame({'Season':[1,1,2,2,3,3],'item':['A','B','A','B','A','B'],'value':[25,30,50,75,40,60]})
xdf =
Season item value
0 1 A 25
1 1 B 30
2 2 A 50
3 2 B 75
4 3 A 40
5 3 B 60
Expected answer:
xdf =
Season item value %value
0 1 A 25 100
1 1 B 30 120
2 2 A 50 100
3 2 B 75 150
4 3 A 40 100
5 3 B 60 150
CodePudding user response:
Let us create multiindex on Season
and item
in order to simplify calculation:
s = xdf.set_index(['Season', 'item'])['value']
xdf['%value'] = s.div(s.xs('A', level=1)).mul(100).tolist()
Season item value %value
0 1 A 25 100.0
1 1 B 30 120.0
2 2 A 50 100.0
3 2 B 75 150.0
4 3 A 40 100.0
5 3 B 60 150.0
CodePudding user response:
If your rows are correctly ordered ('A' before 'B'), you can use pct_change
. If needed sort by item first.
xdf['%value'] = xdf.groupby('Season')['value'].pct_change().fillna(0) * 100 100
print(xdf)
# Output
Season item value %value
0 1 A 25 100.0
1 1 B 30 120.0
2 2 A 50 100.0
3 2 B 75 150.0
4 3 A 40 100.0
5 3 B 60 150.0