Home > Net >  Name of first column that is non NaN ir Null
Name of first column that is non NaN ir Null

Time:11-02

I'm looking how to add a column to df that holds the column name of the first column that is not NaN or Null. Example: column "first" based on columns 'A', 'B', 'C', 'D'

Name    A     B   C   D   First

Alex    100  Nan Null NaN   A

Michael 300  400 Null Null  A

Steve   Null 100 100  100   B

John    Null Null Nan Null Null

CodePudding user response:

Convert Null, Nan values to misisng values and if necessary Name to index, then test DataFrame.notna and get first matched column by DataFrame.idxmax, last set NaN if all NaNs per row:

df = df.replace(['Null', 'Nan'], np.nan)
df = df.set_index('Name')

m = df.notna()
df['First'] = m.idxmax(axis=1).where(m.any(axis=1))
print (df)
           A    B    C    D First
Name                             
Alex     100  Nan  NaN  NaN     A
Michael  300  400  NaN  NaN     A
Steve    NaN  100  100  100     B
John     NaN  NaN  NaN  NaN   NaN

If need original values without replace:

df = df.set_index('Name')
m = df.isin(['Null', 'Nan']) | df.isna()

df['First'] = (~m).idxmax(axis=1).mask(m.all(axis=1), 'Null')
  • Related