I'm trying to calculate the percentage of increase in a price, based on the price of the previous day and the price of two days before. The problem I'm fancing is that I don't know how to access to the element of two days before properly.
The dataset is the following:
slug | datetime | price_usd |
---|---|---|
bitcoin | 2022-06-01 00:00:00 00:00 | 29799.079714 |
ethereum | 2022-06-01 00:00:00 00:00 | 1823.569357 |
bitcoin | 2022-06-02 00:00:00 00:00 | 30467.487741 |
ethereum | 2022-06-02 00:00:00 00:00 | 1775.07862 |
bitcoin | 2022-06-03 00:00:00 00:00 | 29704.391357 |
ethereum | 2022-06-03 00:00:00 00:00 | 1834.15057 |
bitcoin | 2022-06-04 00:00:00 00:00 | 29832.914225 |
ethereum | 2022-06-04 00:00:00 00:00 | 1801.609446 |
bitcoin | 2022-06-05 00:00:00 00:00 | 29906.661748 |
ethereum | 2022-06-05 00:00:00 00:00 | 1805.204989 |
bitcoin | 2022-06-06 00:00:00 00:00 | 31364.535502 |
ethereum | 2022-06-06 00:00:00 00:00 | 1858.546508 |
The output dataset should be the following.
increase_24 is percentage increase over the previous day and increase_48 is percentage increase over 2 previous days
slug | datetime | price_usd | %increase_24 | %increase_48 |
---|---|---|---|---|
bitcoin | 2022-06-01 00:00:00 00:00 | 29799.079714 | ||
ethereum | 2022-06-01 00:00:00 00:00 | 1823.569357 | ||
bitcoin | 2022-06-02 00:00:00 00:00 | 30467.487741 | 0.0243 | |
ethereum | 2022-06-02 00:00:00 00:00 | 1775.07862 | -0.0265 | |
bitcoin | 2022-06-03 00:00:00 00:00 | 29704.391357 | -0.0250 | -0.0031 |
ethereum | 2022-06-03 00:00:00 00:00 | 1834.15057 | 0.0332 | 0.0058 |
bitcoin | 2022-06-04 00:00:00 00:00 | 29832.914225 | 0.0043 | -0.0208 |
ethereum | 2022-06-04 00:00:00 00:00 | 1801.609446 | -0.0177 | 0.0149 |
bitcoin | 2022-06-05 00:00:00 00:00 | 29906.661748 | 0.0024 | 0.0068 |
ethereum | 2022-06-05 00:00:00 00:00 | 1805.204989 | 0.0019 | -0.0015 |
I think the solution should be something like that...
df.set_index('datetime', inplace=True)
df_aux = df.shift(periods=1, freq='D')
pd.concat([df_btc, df_aux], axis=1)
but when I try to concat, python shows me the following error:
InvalidIndexError: Reindexing only valid with uniquely valued Index objects
Any Idea on How to obtain the 2 columns on the right? (%increase_24 and %increase_48)
Thanks in advance
CodePudding user response:
groupby
pct_change
g = df.groupby('slug')
df['%increase_24'] = g['price_usd'].pct_change(1)
df['%increase_48'] = g['price_usd'].pct_change(2)
slug datetime price_usd %increase_24 %increase_48
0 bitcoin 2022-06-01 00:00:00 00:00 29799.079714 NaN NaN
1 ethereum 2022-06-01 00:00:00 00:00 1823.569357 NaN NaN
2 bitcoin 2022-06-02 00:00:00 00:00 30467.487741 0.022430 NaN
3 ethereum 2022-06-02 00:00:00 00:00 1775.078620 -0.026591 NaN
4 bitcoin 2022-06-03 00:00:00 00:00 29704.391357 -0.025046 -0.003178
5 ethereum 2022-06-03 00:00:00 00:00 1834.150570 0.033278 0.005802
6 bitcoin 2022-06-04 00:00:00 00:00 29832.914225 0.004327 -0.020828
7 ethereum 2022-06-04 00:00:00 00:00 1801.609446 -0.017742 0.014946
8 bitcoin 2022-06-05 00:00:00 00:00 29906.661748 0.002472 0.006809
9 ethereum 2022-06-05 00:00:00 00:00 1805.204989 0.001996 -0.015781
10 bitcoin 2022-06-06 00:00:00 00:00 31364.535502 0.048747 0.051340
11 ethereum 2022-06-06 00:00:00 00:00 1858.546508 0.029549 0.031603
PS: I am assuming the datetime
values per slug
are continuous with frequency of 1 days
CodePudding user response:
Try groupby
first with pct_change
:
df['%increase_24'] = df.groupby('slug')['price_usd'].pct_change(1)
df['%increase_48'] = df.groupby('slug')['price_usd'].pct_change(2)
[out]
slug datetime price_usd %increase_24 %increase_48
0 bitcoin 2022-06-01 00:00:00 00:00 29799.079714 NaN NaN
1 ethereum 2022-06-01 00:00:00 00:00 1823.569357 NaN NaN
2 bitcoin 2022-06-02 00:00:00 00:00 30467.487741 0.022430 NaN
3 ethereum 2022-06-02 00:00:00 00:00 1775.078620 -0.026591 NaN
4 bitcoin 2022-06-03 00:00:00 00:00 29704.391357 -0.025046 -0.003178
5 ethereum 2022-06-03 00:00:00 00:00 1834.150570 0.033278 0.005802
6 bitcoin 2022-06-04 00:00:00 00:00 29832.914225 0.004327 -0.020828
7 ethereum 2022-06-04 00:00:00 00:00 1801.609446 -0.017742 0.014946
8 bitcoin 2022-06-05 00:00:00 00:00 29906.661748 0.002472 0.006809
9 ethereum 2022-06-05 00:00:00 00:00 1805.204989 0.001996 -0.015781
10 bitcoin 2022-06-06 00:00:00 00:00 31364.535502 0.048747 0.051340
11 ethereum 2022-06-06 00:00:00 00:00 1858.546508 0.029549 0.031603