Home > front end >  Filter out all rows based on condition and delete rows fulfilling the condition in pandas
Filter out all rows based on condition and delete rows fulfilling the condition in pandas

Time:03-18

I have data like this:

Task ID Status
Task1 123 Open
Task2 123 Closed
Task3 211 Closed
Task4 211 Closed
Task5 564 Closed
Task6 994 Open

I want to delete rows with the same IDs that have 'Open' status. In other words I want to delete all IDs that have 'Open' status.

The end result will be like this:

Task ID Status
Task3 211 Closed
Task4 211 Closed
Task5 564 Closed

Data:

{'Task': ['Task1', 'Task2', 'Task3', 'Task4', 'Task5', 'Task6'],
 'ID': [123, 123, 211, 211, 564, 994],
 'Status': ['Open', 'Closed', 'Closed', 'Closed', 'Closed', 'Open']}

CodePudding user response:

We could take the Open Statuses and groupby cummax to create a boolean filter.

The idea is if a Status is Open, we flag its corresponding ID as True for all rows it appears, then we filter out all such rows:

out = df[~df['Status'].eq('Open').groupby(df['ID']).cummax()]

Output:

    Task   ID  Status
2  Task3  211  Closed
3  Task4  211  Closed
4  Task5  564  Closed
  • Related