I have a dataframe with open, high, low, close prices of a stock. I want to add an additional column that has the percent change between today's open and yesterday's high. This is my current implementation, however, the resulting column contains percent changes between the current day's high and open.
df
open high low close
0 100 110 95 103
1 103 113 103 111
2 111 132 109 124
3 124 136 114 130
My attempt (incorrect):
df['prevhigh_curropen'] = (df['open'] - df['high']).shift(-1) / df['high'].shift(-1)
Output (incorrect):
open high low close prevhigh_curropen
0 100 110 95 103 -0.091
1 103 113 103 111 -0.089
2 111 132 109 124 -0.159
3 124 136 114 130 -0.088
Desired output:
open high low close prevhigh_curropen
0 100 110 95 103 nan
1 103 113 103 111 -0.064
2 111 132 109 124 -0.018
3 124 136 114 130 -0.061
Is there a non-iterative way to do this like I attempted above?
CodePudding user response:
Your formula is wrong, you have to use df['high'].shift()
:
df = pd.DataFrame({'open': range(1, 11), 'high': range(1, 11)})
df['prevhigh_curropen'] = df['open'].sub(df['high'].shift()) \
.div(df['high'].shift()) \
.mul(100)
>>> df
open high prevhigh_curropen
0 1 1 NaN
1 2 2 100.000000
2 3 3 50.000000
3 4 4 33.333333
4 5 5 25.000000
5 6 6 20.000000
6 7 7 16.666667
7 8 8 14.285714
8 9 9 12.500000
9 10 10 11.111111
For your sample the output is:
>>> df
open high low close prevhigh_curropen
0 100 110 95 103 NaN
1 103 113 103 111 -6.363636
2 111 132 109 124 -1.769912
3 124 136 114 130 -6.060606
The first value is NaN
because we don't know the high
value from the previous day.
CodePudding user response:
We can simplify the terms slightly from (a - b) / b
to (a / b) - (b / b)
to (a / b) - 1
.
Mathematical Operators:
df['prevhigh_curropen'] = (df['open'] / df['high'].shift()) - 1
or with Series Methods:
df['prevhigh_curropen'] = df['open'].div(df['high'].shift()).sub(1)
*The benefit here is that we only need to shift once, and maintain 1 copy of df['high'].shift()
Resulting df
:
open high low close prevhigh_curropen
0 100 110 95 103 NaN
1 103 113 103 111 -0.063636
2 111 132 109 124 -0.017699
3 124 136 114 130 -0.060606
Setup Used:
import pandas as pd
df = pd.DataFrame({
'open': [100, 103, 111, 124],
'high': [110, 113, 132, 136],
'low': [95, 103, 109, 114],
'close': [103, 111, 124, 130]
})