Home > Back-end >  Filter out rows with common field where at least one fulfills a condition
Filter out rows with common field where at least one fulfills a condition

Time:03-19

I have data like this:

Task ID Status
Task1 123 Open
Task2 123 Closed
Task3 211 Closed
Task4 211 Closed
Task5 564 Closed
Task6 994 Open

I want to delete rows with the same IDs that have 'Open' status. In other words I want to delete all IDs that have 'Open' status.

The end result will be like this:

Task ID Status
Task3 211 Closed
Task4 211 Closed
Task5 564 Closed

Data:

{'Task': ['Task1', 'Task2', 'Task3', 'Task4', 'Task5', 'Task6'],
 'ID': [123, 123, 211, 211, 564, 994],
 'Status': ['Open', 'Closed', 'Closed', 'Closed', 'Closed', 'Open']}

CodePudding user response:

We could take the Open Statuses and groupby cummax to create a boolean filter.

The idea is if a Status is Open, we flag its corresponding ID as True for all rows it appears, then we filter out all such rows:

out = df[~df['Status'].eq('Open').groupby(df['ID']).cummax()]

Output:

    Task   ID  Status
2  Task3  211  Closed
3  Task4  211  Closed
4  Task5  564  Closed
  • Related