How to fill in a column with column names whose rows are not NULL in Pandas?-CodePudding

I have a dataframe as follows;

df_d = {'consolidate_elem': {0: np.nan,
  1: np.nan,
  2: np.nan,
  3: np.nan,
  4: np.nan,
  5: np.nan,
  6: np.nan,
  7: np.nan,
  8: np.nan,
  9: np.nan,
  10: np.nan},
 'A': {0: np.nan,
  1: np.nan,
  2: '6/9/1972',
  3: '9/4/1943',
  4: '1/29/1944',
  5: '7/31/1965',
  6: '4/5/1979',
  7: np.nan,
  8: '3/17/2000',
  9: '3/18/2000',
  10: '3/17/2000'},
 'B': {0: np.nan,
  1: np.nan,
  2: np.nan,
  3: 'Yes',
  4: np.nan,
  5: np.nan,
  6: np.nan,
  7: np.nan,
  8: np.nan,
  9: np.nan,
  10: np.nan},
 'C': {0: np.nan,
  1: np.nan,
  2: np.nan,
  3: np.nan,
  4: 'Yes',
  5: 'Yes',
  6: np.nan,
  7: 'Yes',
  8: 'Yes',
  9: 'Yes',
  10: 'Yes'},
 'D': {0: '11100',
  1: '11721',
  2: np.nan,
  3: '55626',
  4: '5,562,655,626.00',
  5: np.nan,
  6: np.nan,
  7: '33050',
  8: '33050',
  9: '33050',
  10: np.nan}}

Here i would like to fill in a column consolidate_elem with a column names whose rows are not NAN's in above data frame. the expected output will be as below;

So here in first row we have a value in D column so in consolidate_elem it is filled as D.

Similarly in index -4 it is as A;C;D as they have values.

CodePudding user response：

Use DataFrame.dot for matrix multiplication by mask created by compared not missing values by DataFrame.notna with omit first column:

df = pd.DataFrame(df_d)

df['consolidate_elem'] = df.iloc[:, 1:].notna().dot(df.columns[1:]   ';').str.strip(';')
print (df)
   consolidate_elem          A    B    C                 D
0                 D        NaN  NaN  NaN             11100
1                 D        NaN  NaN  NaN             11721
2                 A   6/9/1972  NaN  NaN               NaN
3             A;B;D   9/4/1943  Yes  NaN             55626
4             A;C;D  1/29/1944  NaN  Yes  5,562,655,626.00
5               A;C  7/31/1965  NaN  Yes               NaN
6                 A   4/5/1979  NaN  NaN               NaN
7               C;D        NaN  NaN  Yes             33050
8             A;C;D  3/17/2000  NaN  Yes             33050
9             A;C;D  3/18/2000  NaN  Yes             33050
10              A;C  3/17/2000  NaN  Yes               NaN

Or with droping column by name:

df['consolidate_elem'] = (df.drop('consolidate_elem', axis=1).notna()
                            .dot(df.columns.drop('consolidate_elem')   ';')
                            .str.strip(';'))