Home > Net >  Python get percentage increase based on datetime columns on previous days
Python get percentage increase based on datetime columns on previous days

Time:06-30

I'm trying to calculate the percentage of increase in a price, based on the price of the previous day and the price of two days before. The problem I'm fancing is that I don't know how to access to the element of two days before properly.

The dataset is the following:

slug datetime price_usd
bitcoin 2022-06-01 00:00:00 00:00 29799.079714
ethereum 2022-06-01 00:00:00 00:00 1823.569357
bitcoin 2022-06-02 00:00:00 00:00 30467.487741
ethereum 2022-06-02 00:00:00 00:00 1775.07862
bitcoin 2022-06-03 00:00:00 00:00 29704.391357
ethereum 2022-06-03 00:00:00 00:00 1834.15057
bitcoin 2022-06-04 00:00:00 00:00 29832.914225
ethereum 2022-06-04 00:00:00 00:00 1801.609446
bitcoin 2022-06-05 00:00:00 00:00 29906.661748
ethereum 2022-06-05 00:00:00 00:00 1805.204989
bitcoin 2022-06-06 00:00:00 00:00 31364.535502
ethereum 2022-06-06 00:00:00 00:00 1858.546508

The output dataset should be the following.

increase_24 is percentage increase over the previous day and increase_48 is percentage increase over 2 previous days

slug datetime price_usd %increase_24 %increase_48
bitcoin 2022-06-01 00:00:00 00:00 29799.079714
ethereum 2022-06-01 00:00:00 00:00 1823.569357
bitcoin 2022-06-02 00:00:00 00:00 30467.487741 0.0243
ethereum 2022-06-02 00:00:00 00:00 1775.07862 -0.0265
bitcoin 2022-06-03 00:00:00 00:00 29704.391357 -0.0250 -0.0031
ethereum 2022-06-03 00:00:00 00:00 1834.15057 0.0332 0.0058
bitcoin 2022-06-04 00:00:00 00:00 29832.914225 0.0043 -0.0208
ethereum 2022-06-04 00:00:00 00:00 1801.609446 -0.0177 0.0149
bitcoin 2022-06-05 00:00:00 00:00 29906.661748 0.0024 0.0068
ethereum 2022-06-05 00:00:00 00:00 1805.204989 0.0019 -0.0015

I think the solution should be something like that...

df.set_index('datetime', inplace=True)
df_aux = df.shift(periods=1, freq='D')
pd.concat([df_btc, df_aux],  axis=1)

but when I try to concat, python shows me the following error:

InvalidIndexError: Reindexing only valid with uniquely valued Index objects

Any Idea on How to obtain the 2 columns on the right? (%increase_24 and %increase_48)

Thanks in advance

CodePudding user response:

groupby pct_change

g = df.groupby('slug')
df['%increase_24'] = g['price_usd'].pct_change(1)
df['%increase_48'] = g['price_usd'].pct_change(2)

        slug                  datetime     price_usd  %increase_24  %increase_48
0    bitcoin 2022-06-01 00:00:00 00:00  29799.079714           NaN           NaN
1   ethereum 2022-06-01 00:00:00 00:00   1823.569357           NaN           NaN
2    bitcoin 2022-06-02 00:00:00 00:00  30467.487741      0.022430           NaN
3   ethereum 2022-06-02 00:00:00 00:00   1775.078620     -0.026591           NaN
4    bitcoin 2022-06-03 00:00:00 00:00  29704.391357     -0.025046     -0.003178
5   ethereum 2022-06-03 00:00:00 00:00   1834.150570      0.033278      0.005802
6    bitcoin 2022-06-04 00:00:00 00:00  29832.914225      0.004327     -0.020828
7   ethereum 2022-06-04 00:00:00 00:00   1801.609446     -0.017742      0.014946
8    bitcoin 2022-06-05 00:00:00 00:00  29906.661748      0.002472      0.006809
9   ethereum 2022-06-05 00:00:00 00:00   1805.204989      0.001996     -0.015781
10   bitcoin 2022-06-06 00:00:00 00:00  31364.535502      0.048747      0.051340
11  ethereum 2022-06-06 00:00:00 00:00   1858.546508      0.029549      0.031603

PS: I am assuming the datetime values per slug are continuous with frequency of 1 days

CodePudding user response:

Try groupby first with pct_change:

df['%increase_24'] = df.groupby('slug')['price_usd'].pct_change(1)
df['%increase_48'] = df.groupby('slug')['price_usd'].pct_change(2)

[out]

        slug                   datetime     price_usd  %increase_24  %increase_48
0    bitcoin  2022-06-01 00:00:00 00:00  29799.079714           NaN           NaN
1   ethereum  2022-06-01 00:00:00 00:00   1823.569357           NaN           NaN
2    bitcoin  2022-06-02 00:00:00 00:00  30467.487741      0.022430           NaN
3   ethereum  2022-06-02 00:00:00 00:00   1775.078620     -0.026591           NaN
4    bitcoin  2022-06-03 00:00:00 00:00  29704.391357     -0.025046     -0.003178
5   ethereum  2022-06-03 00:00:00 00:00   1834.150570      0.033278      0.005802
6    bitcoin  2022-06-04 00:00:00 00:00  29832.914225      0.004327     -0.020828
7   ethereum  2022-06-04 00:00:00 00:00   1801.609446     -0.017742      0.014946
8    bitcoin  2022-06-05 00:00:00 00:00  29906.661748      0.002472      0.006809
9   ethereum  2022-06-05 00:00:00 00:00   1805.204989      0.001996     -0.015781
10   bitcoin  2022-06-06 00:00:00 00:00  31364.535502      0.048747      0.051340
11  ethereum  2022-06-06 00:00:00 00:00   1858.546508      0.029549      0.031603
  • Related