Home > other >  Find last occurrence of a max value in a row and get Column name in Pandas Python
Find last occurrence of a max value in a row and get Column name in Pandas Python

Time:09-21

I have the below dataframe called df:

Id Stage1 Stage2 Stage3
1 2022-02-01 2020-04-03 2022-06-07
--- ------------ ------------ -----------
2 2023-06-07 2020-03-01 2020-09-03
--- ------------ ------------ -----------
3 2023-02-04 2023-06-07 2022-06-07
--- ------------ ------------ -----------
4 2022-05-08 2023-09-01 2023-09-01

I need to calculate the max date for each ID and its respective Stage. So for Order 1,2,3,4 the Stages I need are Stage 3, Stage 1, Stage 2, and Stage 3 respectively. I understand that using

df.filter(like="stage").idxmax(axis=1)

Finds the first occurrence of max date in a row and gives me its column name. However, for Order 4, Stage 2 and 3 have the same date. I need Stage 3 as my answer as Stage 3 is the latest stage of the order. How is this possible?

CodePudding user response:

Swap order of columns for match latest maximal value:

s = df.filter(like="Stage").iloc[:, ::-1].idxmax(axis=1)
print (s)
0    Stage3
1    Stage1
2    Stage2
3    Stage3
dtype: object

Details:

print (df.filter(like="Stage").iloc[:, ::-1])
      Stage3     Stage2     Stage1
0 2022-06-07 2020-04-03 2022-02-01
1 2020-09-03 2020-03-01 2023-06-07
2 2022-06-07 2023-06-07 2023-02-04
3 2023-09-01 2023-09-01 2022-05-08
  • Related