I have a df like this:
Low isLower isPrevHigher isNextHigher
0 22470.0 True False True
1 22480.0 NaN False True
2 22576.6 NaN False True
3 22600.4 NaN False False
4 22583.5 NaN True True
5 22652.2 NaN False True
6 22656.8 NaN False False
7 22646.5 NaN True False
8 22600.0 NaN True False
9 22555.0 NaN True True
10 22580.1 NaN False True
11 22620.0 NaN False True
12 22682.2 NaN False False
13 22681.0 NaN True True
14 22710.8 NaN False False
15 22657.2 NaN True False
16 22623.0 NaN True True
17 22634.0 NaN False True
18 22660.0 NaN False True
19 22673.6 NaN False True
20 22721.2 NaN False False
21 22580.0 NaN True False
22 22552.6 NaN True False
23 22382.6 True True False
24 22353.0 True True False
25 22341.7 True True False
26 22312.4 True True False
27 22256.4 True True True
28 22310.6 True False False
29 22286.0 True True True
30 22306.8 True False True
31 22386.3 True False False
I want to drop all rows after the first row where isLower == True & isPrevHigher == True & isNextHigher == True.
In the sample above that is row 27, so everything after row 27 should be dropped and row 27 itself kept.
CodePudding user response:
drop_row = df[df[['isLower', 'isPrevHigher', 'isNextHigher']].eq(True).all(axis=1)].index[0]
df = df[df.index <= drop_row]
print(df)
Output:
Low isLower isPrevHigher isNextHigher
0 22470.0 True False True
1 22480.0 NaN False True
2 22576.6 NaN False True
3 22600.4 NaN False False
4 22583.5 NaN True True
5 22652.2 NaN False True
6 22656.8 NaN False False
7 22646.5 NaN True False
8 22600.0 NaN True False
9 22555.0 NaN True True
10 22580.1 NaN False True
11 22620.0 NaN False True
12 22682.2 NaN False False
13 22681.0 NaN True True
14 22710.8 NaN False False
15 22657.2 NaN True False
16 22623.0 NaN True True
17 22634.0 NaN False True
18 22660.0 NaN False True
19 22673.6 NaN False True
20 22721.2 NaN False False
21 22580.0 NaN True False
22 22552.6 NaN True False
23 22382.6 True True False
24 22353.0 True True False
25 22341.7 True True False
26 22312.4 True True False
27 22256.4 True True True
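One thing to watch with the snippet above: `.index[0]` raises an IndexError when no row has all three flags set. A minimal sketch of a guarded variant follows, assuming a default integer index; the five-row frame and the names `all_true`, `first_match` and `trimmed` are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Hypothetical miniature of the question's frame (values chosen for illustration).
df = pd.DataFrame({
    'Low': [22470.0, 22480.0, 22256.4, 22310.6, 22286.0],
    'isLower': [True, None, True, True, True],
    'isPrevHigher': [False, False, True, False, True],
    'isNextHigher': [True, True, True, False, True],
})

cols = ['isLower', 'isPrevHigher', 'isNextHigher']

# True only for rows where all three flags are True (missing values count as False)
all_true = df[cols].eq(True).all(axis=1)

if all_true.any():
    # idxmax returns the label of the first True in a boolean Series
    first_match = all_true.idxmax()
    # .loc slicing includes the end label, so the matching row is kept
    trimmed = df.loc[:first_match]
else:
    # no matching row: keep everything
    first_match = None
    trimmed = df

print(trimmed)
```

Using `.loc[:first_match]` instead of comparing `df.index <= drop_row` also works when the index labels are not sorted numbers, since it slices by position of the label rather than by value.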
CodePudding user response:
You may want to drop rows on/after the first row with ALL empty values:
import pandas as pd
# create another data frame
df = pd.DataFrame(
{'direction': ['north', 'east', 'south', None, 'up', 'down'],
'amount': [10, 20, 30, None, 100, 200]})
# does the whole row consist of `None`
df['row_is_none'] = df.isna().all(axis=1)
# calculate the cumulative sum of the new column
df['row_is_non_accum'] = df['row_is_none'].cumsum()
# create boolean mask and perform drop (not shown to save space)
print(df)
direction amount row_is_none row_is_non_accum
0 north 10.0 False 0
1 east 20.0 False 0
2 south 30.0 False 0
3 None NaN True 1
4 up 100.0 False 1
5 down 200.0 False 1
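The mask and drop step that the answer leaves out could look roughly like the sketch below. It rebuilds the same example frame; the names `keep`, `keep_inclusive`, `result` and `result_inclusive` are assumptions introduced here, not part of the answer.

```python
import pandas as pd

df = pd.DataFrame(
    {'direction': ['north', 'east', 'south', None, 'up', 'down'],
     'amount': [10, 20, 30, None, 100, 200]})

# True for rows where every column is missing
row_is_none = df.isna().all(axis=1)

# The running count stays 0 until the first all-missing row is reached,
# so this keeps only the rows strictly before it
keep = row_is_none.cumsum().eq(0)
result = df[keep]

# Shifting the running count down one row also keeps the first flagged row,
# which is the behaviour the question asks for with its own condition
keep_inclusive = row_is_none.cumsum().shift(fill_value=0).eq(0)
result_inclusive = df[keep_inclusive]

print(result)
print(result_inclusive)
```

For the question's frame, the same idea should apply with `row_is_none` replaced by the "all three flags are True" condition.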
CodePudding user response:
This finds the rows where all the specified columns are True, takes the position of the first such row, and then uses iloc to select every row up to and including it:
import numpy as np
# np.where returns a tuple of arrays; [0][0] is the position of the first match
first_all_true = np.where((df['isLower'] == True) & (df['isPrevHigher'] == True) & (df['isNextHigher'] == True))[0][0]
df.iloc[0:first_all_true + 1]
CodePudding user response:
Using boolean indexing with the help of all, the boolean NOT (~), and cummin, plus a shift so that the first matching row itself is kept:
df[(~df[['isLower', 'isPrevHigher', 'isNextHigher']].eq(True).all(axis=1)).cummin().shift(fill_value=True)]
NB. untested answer
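Since the answer is flagged as untested, here is a small self-contained check of the cummin approach on a made-up five-row frame (the data and the names `before_match` and `through_match` are assumptions for illustration). It compares the plain cummin mask, which stops just before the first all-True row, with the same mask shifted down by one row, which keeps that row as well.

```python
import pandas as pd

# Made-up frame; row 2 is the first row where all three flags are True.
df = pd.DataFrame({
    'isLower': [True, None, True, True, True],
    'isPrevHigher': [False, True, True, False, True],
    'isNextHigher': [True, True, True, False, True],
})

all_true = df[['isLower', 'isPrevHigher', 'isNextHigher']].eq(True).all(axis=1)

# cummin turns False at the first all-True row and stays False afterwards
before_match = (~all_true).cummin()

# Shifting the mask down by one row also keeps the matching row itself
through_match = before_match.shift(fill_value=True)

print(df[before_match].index.tolist())
print(df[through_match].index.tolist())
```

The question asks to drop everything after the matching row, so the shifted mask is the one that fits that requirement.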