Filtering rows before a certain criteria by group in pandas

Time:04-10

I am trying to work out how to filter rows in pandas based on a criterion that is applied separately within each group. Dummy data and the desired result are shown below.

Dummy data

import pandas as pd

data = {'ID': ['1111', '1111', '1111', '1111', '1112', '1112', '1112', '1112', '1112'],
        'Category': [1, 2, 2, 2, 1, 3, 2, 2, 1]}
df = pd.DataFrame(data)
ID      Category
1111    1
1111    2
1111    2
1111    2
1112    1
1112    3
1112    2
1112    2
1112    1

I want to filter the dataframe so that, for each ID, it keeps only the rows before the first occurrence of Category 2. This is what the desired result would look like:

ID      Category
1111    1
1112    1
1112    3

CodePudding user response:

Try this:

df[(df['Category'] != 2).groupby(df['ID']).cumprod().astype(bool)]

Output:

     ID  Category
0  1111         1
4  1112         1
5  1112         3

Details:
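Create a boolean Series that is True where Category is not equal to 2, then take the cumulative product within each ID group. Once the first False (Category == 2) appears in a group, the product stays False for every later row in that group, so only the rows before the first 2 are kept.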
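
To make the intermediate steps visible, here is a minimal sketch that builds the mask one piece at a time on the dummy data from the question. The names not_two, keep and result are only illustrative. The casts to int and back to bool are an added precaution, on the assumption that the dtype returned by a grouped cumprod over booleans can differ between pandas versions.

```python
import pandas as pd

data = {'ID': ['1111', '1111', '1111', '1111', '1112', '1112', '1112', '1112', '1112'],
        'Category': [1, 2, 2, 2, 1, 3, 2, 2, 1]}
df = pd.DataFrame(data)

# True where the row is NOT the target category (2)
not_two = df['Category'] != 2

# Running product within each ID: 1 until the first 2 is seen, 0 from then on.
# Cast to int before and to bool after so the result is always a boolean mask.
keep = not_two.astype(int).groupby(df['ID']).cumprod().astype(bool)

# Boolean indexing keeps only the rows before the first 2 in each ID
result = df[keep]
print(result)
```

Note that an ID which never contains Category 2 keeps all of its rows with this approach, because its running product never drops to zero.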

CodePudding user response:

df = pd.DataFrame(data)
df.loc[df.Category.eq(2).groupby(df.ID).cumsum().eq(0)]
#      ID  Category
# 0  1111         1
# 4  1112         1
# 5  1112         3
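
This second answer works by counting instead of multiplying: within each ID it keeps a running count of how many 2s have been seen so far, and keeps the rows where that count is still 0. A rough sketch of that idea follows, wrapped in a helper. The function rows_before_first and its parameter names are hypothetical, introduced only for illustration.

```python
import pandas as pd


def rows_before_first(df, group_col, value_col, target):
    """Hypothetical helper: keep rows that precede the first `target` in each group."""
    is_target = df[value_col].eq(target)
    # Running count of target rows seen so far within each group
    seen_so_far = is_target.astype(int).groupby(df[group_col]).cumsum()
    # A count of 0 means no target has appeared yet in this group
    return df.loc[seen_so_far.eq(0)]


data = {'ID': ['1111', '1111', '1111', '1111', '1112', '1112', '1112', '1112', '1112'],
        'Category': [1, 2, 2, 2, 1, 3, 2, 2, 1]}
df = pd.DataFrame(data)

# Intermediate running count, shown for inspection
seen = df['Category'].eq(2).astype(int).groupby(df['ID']).cumsum()
print(seen.tolist())

out = rows_before_first(df, 'ID', 'Category', 2)
print(out)
```

Both answers select the same rows on this data; the choice between cumprod and cumsum is mostly a matter of which reads more clearly to you.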