Filter out rows with common field where at least one fulfills a condition-CodePudding

I have data like this:

Task	ID	Status
Task1	123	Open
Task2	123	Closed
Task3	211	Closed
Task4	211	Closed
Task5	564	Closed
Task6	994	Open

I want to delete rows with the same IDs that have 'Open' status. In other words I want to delete all IDs that have 'Open' status.

The end result will be like this:

Task	ID	Status
Task3	211	Closed
Task4	211	Closed
Task5	564	Closed

Data:

{'Task': ['Task1', 'Task2', 'Task3', 'Task4', 'Task5', 'Task6'],
 'ID': [123, 123, 211, 211, 564, 994],
 'Status': ['Open', 'Closed', 'Closed', 'Closed', 'Closed', 'Open']}

CodePudding user response：

We could take the Open Statuses and groupby cummax to create a boolean filter.

The idea is if a Status is Open, we flag its corresponding ID as True for all rows it appears, then we filter out all such rows:

out = df[~df['Status'].eq('Open').groupby(df['ID']).cummax()]

Output:

    Task   ID  Status
2  Task3  211  Closed
3  Task4  211  Closed
4  Task5  564  Closed