I have a pandas dataframe like this:
   close  low  max_close  higher_low
0      2    1       True       False
1      3    4      False        True
2      1    2       True       False
3      0    3      False       False
4      5    2      False        True
5      4    5      False        True
6      3    3       True       False
7      6    7      False        True
It can be created with the following code:
import pandas as pd

df = pd.DataFrame(
    {
        'close': [2, 3, 1, 0, 5, 4, 3, 6],
        'low': [1, 4, 2, 3, 2, 5, 3, 7],
        'max_close': [True, False, True, False, False, False, True, False],
        'higher_low': [False, True, False, False, True, True, False, True]
    }
)
For each row with a True value in the max_close column, I want to find the first of the following rows where the value in the higher_low column is True and the value in the low column is greater than the value in the close column. That row must also be within the 2 rows after the row where max_close was True.
So the output should be:
   close  low  max_close  higher_low
1      3    4      False        True
7      6    7      False        True
(Index 4 is not in the output because in that row low < close. Index 5 is not in the output because it is three rows after index 2, and the match must be within the next 2 rows.)
Also, I would prefer not to use any for-loops in the code.
Do you have any ideas?
CodePudding user response:
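Use:
lookup = 2
indices = []
# Labels of the max_close rows (assumes the default 0..n-1 index)
starts = df.index[df['max_close']]
for i in range(1, lookup + 1):
    # Rows i steps after each max_close row, dropping any past the end of the frame
    shifted = starts + i
    tmp = df.loc[shifted[shifted < len(df)]]
    tmp_ind = tmp[tmp['higher_low'] & (tmp['low'] > tmp['close'])].index
    indices += tmp_ind.tolist()
# .loc needs a list-like rather than a set; sorting restores the row order
df.loc[sorted(set(indices))]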
Output
   close  low  max_close  higher_low
1      3    4      False        True
7      6    7      False        True
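Since the window is only two rows, the same offset idea can probably be written without a loop by shifting the max_close column. The following is a minimal sketch, not a tested general solution: it hard-codes a lookahead of 2, and the names cond, after_1, after_2 and first_hit are made up for illustration. It keeps a row two steps after a max_close row only if the row one step after did not already match, which is one reading of "the first row" in the question.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'close': [2, 3, 1, 0, 5, 4, 3, 6],
        'low': [1, 4, 2, 3, 2, 5, 3, 7],
        'max_close': [True, False, True, False, False, False, True, False],
        'higher_low': [False, True, False, False, True, True, False, True],
    }
)

# Rows that satisfy both conditions from the question.
cond = df['higher_low'] & (df['low'] > df['close'])

# True where the previous row (or the row two back) has max_close set.
after_1 = df['max_close'].shift(1, fill_value=False)
after_2 = df['max_close'].shift(2, fill_value=False)

# A row two steps after a max_close row only counts if the row
# one step after that same max_close row did not already match.
first_hit = (cond & after_1) | (cond & after_2 & ~cond.shift(1, fill_value=False))

out = df[first_hit]
print(out)
```

On the sample data this selects the rows with index 1 and 7. A larger lookahead would need one shifted mask per offset, so this form is mainly attractive for small, fixed windows.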
CodePudding user response:
Create virtual groups from the max_close column, then keep the first 3 rows of each group (1 row for max_close and the 2 following). Finally, filter on your 2 conditions:
out = (df.groupby(df['max_close'].cumsum()).head(3)
         .query("higher_low & (close < low)"))
print(out)
# Output
   close  low  max_close  higher_low
1      3    4      False        True
7      6    7      False        True
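One thing to watch with the groupby/head(3) approach: it also keeps the max_close row itself, any rows before the first True in max_close (they form group 0), and every matching row in the window rather than only the first. None of that matters for the sample data, but it may for other inputs. Below is a hedged sketch of a stricter variant. It assumes that a window ends at the next max_close row, and the names group, offset and candidates are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'close': [2, 3, 1, 0, 5, 4, 3, 6],
        'low': [1, 4, 2, 3, 2, 5, 3, 7],
        'max_close': [True, False, True, False, False, False, True, False],
        'higher_low': [False, True, False, False, True, True, False, True],
    }
)

# Each True in max_close starts a new group; rows before the first True get 0.
group = df['max_close'].cumsum()
# Position of each row inside its group: 0 is the max_close row itself.
offset = df.groupby(group).cumcount()

candidates = df[
    (group > 0)                    # skip rows before the first max_close row
    & offset.between(1, 2)         # only the next 2 rows, not the max_close row
    & df['higher_low']
    & (df['low'] > df['close'])
]

# Keep only the first qualifying row for each max_close row.
out = candidates.groupby(group.loc[candidates.index]).head(1)
print(out)
```

For the dataframe in the question this again yields the rows with index 1 and 7.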