Say I have the following df called df_trading_pair
:
Start Date Open Price High Price Low Price Close Price End Date
0 2022-07-20 08:00:00 0.19277 0.19324 0.19225 0.19324 2022-07-20 08:04:59.999
1 2022-07-20 08:05:00 0.19321 0.194 0.1932 0.19388 2022-07-20 08:09:59.999
2 2022-07-20 08:10:00 0.19387 0.195 0.19387 0.19489 2022-07-20 08:14:59.999
3 2022-07-20 08:15:00 0.19496 0.19628 0.19495 0.19626 2022-07-20 08:19:59.999
4 2022-07-20 08:20:00 0.19625 0.20406 0.19625 0.2035 2022-07-20 08:24:59.999
I have been trying to figure out a simple way to get the percentage change of the first 4 elements in Open Price
column vs the last 4 elements in Close Price
, so I end up getting the following output in a new_df
:
Close Price vs Open Price % change
0 0.0057
1 0.0087
2 0.0123
3 0.0438
dtype: float64
At first, I thought the following sentence should've worked perfectly, after all both arrays would have 4 elements and would contain exactly the values that I need:
new_df["Close Price vs Open Price % change"] = (df_trading_pair["Close Price"][1:]-df_trading_pair["Open Price"][:-1])/df_trading_pair["Open Price"][:-1]
However, that sentence ended up throwing this output:
Close Price vs Open Price % change
0 NaN
1 0.003468
2 0.005261
3 0.006668
4 NaN
dtype: float64
Which I don't get why, I also tried this other sentence:
new_df["Close Price vs Open Price % Change"] = [(y-x)/x*100 for x in df_trading_pair["Open Price"][:-1] for y in df_trading_pair["Close Price"][1:]]
Which in my opinion should also have done what I was looking for, but unfortunately it did not and threw the following error:
ValueError: Length of values (16) does not match length of index (5)
So, I would like to get some assistance here, what else could I do in order to get the desired output?
CodePudding user response:
You need to use shift
, else pandas will realign your indices:
df['Close Price'].shift(-1).sub(df['Open Price']).div(df['Open Price'])[:-1]
Output:
0 0.005758
1 0.008695
2 0.012328
3 0.043804
dtype: float64
Your approach would work with numpy arrays as there is no index realignment:
c = df['Close Price'][1:].to_numpy()
o = df['Open Price'][:-1].to_numpy()
out = (c-o)/o
Output:
array([0.00575816, 0.0086952 , 0.01232785, 0.04380386])