I have data which looks like this. Task Status would be an empty column initially.
Name ID Date Task_Status
Tom 123 9/24/2022
Sam 123 9/25/2022
Kim 123 9/26/2022
Whenever the ID is the same, I want to assign the Task_Status with the latest date = "OPEN" and the remaining Task_Status = "Closed".
How do I do this for a pandas Dataframe?
This looks something like this.
Name ID Date Task_Status
Tom 123 9/24/2022 CLOSED
Sam 123 9/25/2022 CLOSED
Kim 123 9/26/2022 OPEN
Thank you in advance ! this seems like a simple task but I am not sure how to go about this.
CodePudding user response:
df.sort_values(by=['Date'], ascending=False)
used short method of Dataframe
CodePudding user response:
You can first create a new column with 'CLOSED' as values and then use idxmax
to identify max index and assign the 'OPEN' value to the Task_Status:
df['Task_Status'] = 'CLOSED'
df.loc[pd.to_datetime(df['Date']).idxmax(), 'Task_Status'] = 'OPEN'
print(df)
Name ID Date Task_Status
0 Tom 123 9/24/2022 CLOSED
1 Sam 123 9/25/2022 CLOSED
2 Kim 123 9/26/2022 OPEN