How to select rows filtered with condition on the previous and the next rows in pandas and put them

Time:01-11

Considering the following dataframe df :

import pandas as pd

df = pd.DataFrame(
    {
        "col1": [0,1,2,3,4,5,6,7,8,9,10],
        "col2": ["A","B","C","D","E","F","G","H","I","J","K"],
        "col3": [1e-0,1e-1,1e-2,1e-3,1e-4,1e-5,1e-6,1e-7,1e-8,1e-9,1e-10],
        "col4": [0,4,2,5,6,7,6,3,6,2,1]
    }
)

I would like to select the rows where the col4 value of the current row is greater than the col4 values of both the previous and the next rows, and to store them in an initially empty dataframe.

I wrote the following code, which works:

df1 = pd.DataFrame()
for i in range(1, len(df) - 1, 1):
    if (df.iloc[i]['col4'] > df.iloc[i+1]['col4']) and (df.iloc[i]['col4'] > df.iloc[i-1]['col4']):
        df1 = pd.concat([df1, df.iloc[i:i+1]])

I got the expected dataframe df1:

   col1 col2          col3  col4
1     1    B  1.000000e-01     4
5     5    F  1.000000e-05     7
8     8    I  1.000000e-08     6

But this code is ugly and hard to read. Is there a better solution?

CodePudding user response:

Use boolean indexing: get the previous and next values with Series.shift, test whether the current value is greater with Series.gt, and chain the two conditions with the bitwise AND operator &:

df = df[df['col4'].gt(df['col4'].shift()) & df['col4'].gt(df['col4'].shift(-1))]
print (df)
   col1 col2          col3  col4
1     1    B  1.000000e-01     4
5     5    F  1.000000e-05     7
8     8    I  1.000000e-08     6
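
To see why the first and last rows drop out of this result, it can help to put the shifted values side by side. This is only a minimal sketch using the col4 values from the question; the column names prev, next and is_peak are made up for illustration and are not part of the original answer:

```python
import pandas as pd

# col4 values copied from the question
col4 = pd.Series([0, 4, 2, 5, 6, 7, 6, 3, 6, 2, 1], name="col4")

# shift() moves values down one row (the previous value),
# shift(-1) moves them up one row (the next value).
# The first row has no previous value and the last row has no next value,
# so those positions are filled with NaN.
steps = pd.DataFrame({
    "col4": col4,
    "prev": col4.shift(),
    "next": col4.shift(-1),
})

# A comparison against NaN is always False, so the two end rows
# can never satisfy both conditions.
steps["is_peak"] = steps["col4"].gt(steps["prev"]) & steps["col4"].gt(steps["next"])

peak_index = steps.index[steps["is_peak"]].tolist()
print(steps)
print(peak_index)
```

The rows flagged in is_peak are the same rows that the boolean indexing above keeps.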

EDIT: Solution that always includes the first and last rows:

mask = df['col4'].gt(df['col4'].shift()) & df['col4'].gt(df['col4'].shift(-1))
mask.iloc[[0, -1]] = True
df = df[mask]
print (df)
    col1 col2          col3  col4
0      0    A  1.000000e+00     0
1      1    B  1.000000e-01     4
5      5    F  1.000000e-05     7
8      8    I  1.000000e-08     6
10    10    K  1.000000e-10     1
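
Note that both snippets above assign the result back to df, so running them one after the other would apply the second mask to an already filtered frame. One possible way around that is to wrap the mask in a small function that returns a new frame and leaves df untouched. This is a sketch only; select_local_peaks and its include_ends parameter are hypothetical names, not pandas API:

```python
import pandas as pd


def select_local_peaks(frame, column, include_ends=False):
    """Hypothetical helper: return rows whose `column` value is greater
    than the value in both the previous and the next row."""
    values = frame[column]
    mask = values.gt(values.shift()) & values.gt(values.shift(-1))
    if include_ends and len(mask) > 0:
        # Force the first and last rows into the result, as in the EDIT above
        mask.iloc[[0, -1]] = True
    # Boolean indexing returns a new frame; `frame` itself is not modified
    return frame[mask]


df = pd.DataFrame(
    {
        "col1": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
        "col2": ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K"],
        "col3": [1e-0, 1e-1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6, 1e-7, 1e-8, 1e-9, 1e-10],
        "col4": [0, 4, 2, 5, 6, 7, 6, 3, 6, 2, 1],
    }
)

peaks = select_local_peaks(df, "col4")
with_ends = select_local_peaks(df, "col4", include_ends=True)
print(peaks)
print(with_ends)
```

Because df keeps all of its rows, both variants can be computed from the same original data.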