Given a column value, check if another column value is present in preceding or next 'n' rows

I have the following data

jsonDict = {'Fruit': ['apple', 'orange', 'apple', 'banana', 'orange', 'apple','banana'], 'price': [1, 2, 1, 3, 2, 1, 3]}

    Fruit  price
0   apple      1
1  orange      2
2   apple      1
3  banana      3
4  orange      2
5   apple      1
6  banana      3

For every row where Fruit == 'banana', I want to scan the preceding n rows and the next n rows (relative to the index position of that 'banana' row) for rows where Fruit == 'apple'. The expected output for n=2 is shown below.

    Fruit  price
2   apple      1
5   apple      1

I have tried the following:

position = df[df['Fruit'] == 'banana'].index
resultdf = df.loc[((df.index).isin(position)) & (((df['Fruit'].index+2).isin(['apple'])) | ((df['Fruit'].index-2).isin(['apple'])))]

# Output is an empty dataframe 
Empty DataFrame
Columns: [Fruit, price]
Index: []

Preference will be given to vectorized approaches.

CodePudding user response：

IIUC, you can use 2 masks and boolean indexing:

# df = pd.DataFrame(jsonDict)

n = 2
m1 = df['Fruit'].eq('banana')
# is the row ±n of a banana?
m2 = m1.rolling(2*n+1, min_periods=1, center=True).max().eq(1)
# is the row an apple?
m3 = df['Fruit'].eq('apple')

out = df[m2&m3]

output:

   Fruit  price
2  apple      1
5  apple      1
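
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same rolling-mask idea wrapped in a function. The function name `rows_near` and its parameters are hypothetical, chosen only for illustration. The mask is cast to int before rolling as a precaution, on the assumption that a numeric dtype is the safest input for `rolling`. Note that the window counts row positions, not index labels.

```python
import pandas as pd


def rows_near(df, target, anchor, n, col="Fruit"):
    """Rows where `col` == target that sit within n rows of a row where `col` == anchor.

    Hypothetical helper for illustration; the window is positional (row order),
    not based on index labels.
    """
    # 1 where the row is an anchor (e.g. 'banana'), 0 elsewhere
    is_anchor = df[col].eq(anchor).astype(int)
    # Centered window of 2*n + 1 rows: max is 1 if any anchor lies within +/- n rows
    near_anchor = is_anchor.rolling(2 * n + 1, min_periods=1, center=True).max().eq(1)
    # Keep only the target rows that are close to an anchor
    return df[near_anchor & df[col].eq(target)]


jsonDict = {
    "Fruit": ["apple", "orange", "apple", "banana", "orange", "apple", "banana"],
    "price": [1, 2, 1, 3, 2, 1, 3],
}
df = pd.DataFrame(jsonDict)

result = rows_near(df, target="apple", anchor="banana", n=2)
print(result)
```

With n=2 this selects the apples at index 2 and 5. The apple at index 0 is excluded because the nearest banana (index 3) is three rows away.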