df = pd.DataFrame({'year':[2019,2019,2019,2019,2020,2020,2020,2020,2021,2021,2021,2021],'month':[1,2,3,4,1,2,3,4,1,2,3,4],'values':[33,63,24,19,22,30,60,25,23,24,30,80]})
How can I add the percentage of change between each year/month. For example, if I compare 2019 to 2020 for the month 1 it would be:
22/30 = -33%
the expected result should show 0 for the first year (2019) and for the next years show the increase/decrease percentage-wise.
What I have tried is groupby transform but couldn't get any result.
CodePudding user response:
You can use a shift per year. For this increase the year by 1 and merge
the DataFrame with itself. Then compute the percentage change:
cols = ['year', 'month']
vals = df[cols].merge(df.assign(year=df['year'] 1), on=cols, how='left')['values']
df['change'] = df['values'].sub(vals).div(vals).mul(100).fillna(0)
output:
year month values change
0 2019 1 33 0.000000
1 2019 2 63 0.000000
2 2019 3 24 0.000000
3 2019 4 19 0.000000
4 2020 1 22 -33.333333
5 2020 2 30 -52.380952
6 2020 3 60 150.000000
7 2020 4 25 31.578947
8 2021 1 23 4.545455
9 2021 2 24 -20.000000
10 2021 3 30 -50.000000
11 2021 4 80 220.000000
CodePudding user response:
If you reshape your df
a bit you could use pd.pct_change()
df = df.set_index(['year', 'month']).unstack(level=0)
print(df)
values
year 2019 2020 2021
month
1 33 22 23
2 63 30 24
3 24 60 30
4 19 25 80
res = df.pct_change(axis='columns').mul(100).fillna(0)
print(res)
values
year 2019 2020 2021
month
1 0.0 -33.333333 4.545455
2 0.0 -52.380952 -20.000000
3 0.0 150.000000 -50.000000
4 0.0 31.578947 220.000000