Find last occurrence of a max value in a row and get Column name in Pandas Python-CodePudding

I have the below dataframe called df:

Id	Stage1	Stage2	Stage3
1	2022-02-01	2020-04-03	2022-06-07
---	------------	------------	-----------
2	2023-06-07	2020-03-01	2020-09-03
---	------------	------------	-----------
3	2023-02-04	2023-06-07	2022-06-07
---	------------	------------	-----------
4	2022-05-08	2023-09-01	2023-09-01

I need to calculate the max date for each ID and its respective Stage. So for Order 1,2,3,4 the Stages I need are Stage 3, Stage 1, Stage 2, and Stage 3 respectively. I understand that using

df.filter(like="stage").idxmax(axis=1)

Finds the first occurrence of max date in a row and gives me its column name. However, for Order 4, Stage 2 and 3 have the same date. I need Stage 3 as my answer as Stage 3 is the latest stage of the order. How is this possible?

CodePudding user response：

Swap order of columns for match latest maximal value:

s = df.filter(like="Stage").iloc[:, ::-1].idxmax(axis=1)
print (s)
0    Stage3
1    Stage1
2    Stage2
3    Stage3
dtype: object

Details:

print (df.filter(like="Stage").iloc[:, ::-1])
      Stage3     Stage2     Stage1
0 2022-06-07 2020-04-03 2022-02-01
1 2020-09-03 2020-03-01 2023-06-07
2 2022-06-07 2023-06-07 2023-02-04
3 2023-09-01 2023-09-01 2022-05-08