Python: Extract column name for earliest and latest columns

Time:07-05

I have a dataset with multiple date columns, as follows:

Input Dataframe

> df
Date1       Date2       Date3
2008-08-11  NaT         2005-05-14
2010-06-18  2012-09-04  2003-01-16
NaT         2006-11-05  2009-03-31

Desired Output

I want to add a column, minDatecol, that holds the name of the column with the earliest date in each row:

> df
Date1       Date2       Date3       minDatecol
2008-08-11  NaT         2005-05-14  Date3
2010-06-18  2012-09-04  2003-01-16  Date3
NaT         2006-11-05  2009-03-31  Date2

I tried df[cols].idxmin(axis=1), but I get the following error:

ValueError: attempt to get argmin of an empty sequence

Any help you can provide?

CodePudding user response:

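Replace the missing values with pd.Timestamp.max before calling idxmin, so that a NaT is never picked as the row minimum:

import pandas as pd

cols = ['Date1','Date2','Date3']
# NaT becomes the largest representable timestamp, so it cannot be the minimum
df['minDatecol'] = df[cols].fillna(pd.Timestamp.max).idxmin(axis=1)
print(df)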
       Date1      Date2      Date3 minDatecol
0 2008-08-11        NaT 2005-05-14      Date3
1 2010-06-18 2012-09-04 2003-01-16      Date3
2        NaT 2006-11-05 2009-03-31      Date2
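
Since the title also asks about the latest column, here is a self-contained sketch that rebuilds the sample frame and applies the same idea in both directions. The column name maxDatecol and the use of pd.Timestamp.min as the fill value for the maximum are assumptions for illustration and are not part of the original answer. Note that a row containing only NaT would simply report the first column, which may or may not be what you want.

```python
import pandas as pd

# Rebuild the sample data from the question; None becomes NaT
df = pd.DataFrame({
    "Date1": pd.to_datetime(["2008-08-11", "2010-06-18", None]),
    "Date2": pd.to_datetime([None, "2012-09-04", "2006-11-05"]),
    "Date3": pd.to_datetime(["2005-05-14", "2003-01-16", "2009-03-31"]),
})
cols = ["Date1", "Date2", "Date3"]

# Earliest column: fill NaT with the largest timestamp so it never wins idxmin
df["minDatecol"] = df[cols].fillna(pd.Timestamp.max).idxmin(axis=1)

# Latest column (hypothetical name): fill NaT with the smallest timestamp
# so it never wins idxmax
df["maxDatecol"] = df[cols].fillna(pd.Timestamp.min).idxmax(axis=1)

print(df)
```

Selecting df[cols] explicitly keeps the newly added string columns out of the comparison, so the two assignments can run in either order.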