Home > Software design >  How to pull a column based on the latest date
How to pull a column based on the latest date

Time:09-27

I have data which looks like this. Task Status would be an empty column initially.

Name   ID     Date           Task_Status
Tom    123    9/24/2022     
Sam    123    9/25/2022    
Kim    123    9/26/2022 

Whenever the ID is the same, I want to assign the Task_Status with the latest date = "OPEN" and the remaining Task_Status = "Closed".

How do I do this for a pandas Dataframe?

This looks something like this.

Name   ID     Date           Task_Status
Tom    123    9/24/2022       CLOSED
Sam    123    9/25/2022       CLOSED
Kim    123    9/26/2022        OPEN

Thank you in advance ! this seems like a simple task but I am not sure how to go about this.

CodePudding user response:

df.sort_values(by=['Date'], ascending=False)

used short method of Dataframe

CodePudding user response:

You can first create a new column with 'CLOSED' as values and then use idxmax to identify max index and assign the 'OPEN' value to the Task_Status:

df['Task_Status'] = 'CLOSED'
df.loc[pd.to_datetime(df['Date']).idxmax(), 'Task_Status'] = 'OPEN'

print(df)

  Name   ID       Date Task_Status
0  Tom  123  9/24/2022      CLOSED
1  Sam  123  9/25/2022      CLOSED
2  Kim  123  9/26/2022        OPEN
  • Related