I have a dataframe as follows;
df_d = {'consolidate_elem': {0: np.nan,
1: np.nan,
2: np.nan,
3: np.nan,
4: np.nan,
5: np.nan,
6: np.nan,
7: np.nan,
8: np.nan,
9: np.nan,
10: np.nan},
'A': {0: np.nan,
1: np.nan,
2: '6/9/1972',
3: '9/4/1943',
4: '1/29/1944',
5: '7/31/1965',
6: '4/5/1979',
7: np.nan,
8: '3/17/2000',
9: '3/18/2000',
10: '3/17/2000'},
'B': {0: np.nan,
1: np.nan,
2: np.nan,
3: 'Yes',
4: np.nan,
5: np.nan,
6: np.nan,
7: np.nan,
8: np.nan,
9: np.nan,
10: np.nan},
'C': {0: np.nan,
1: np.nan,
2: np.nan,
3: np.nan,
4: 'Yes',
5: 'Yes',
6: np.nan,
7: 'Yes',
8: 'Yes',
9: 'Yes',
10: 'Yes'},
'D': {0: '11100',
1: '11721',
2: np.nan,
3: '55626',
4: '5,562,655,626.00',
5: np.nan,
6: np.nan,
7: '33050',
8: '33050',
9: '33050',
10: np.nan}}
Here i would like to fill in a column consolidate_elem with a column names whose rows are not NAN's in above data frame. the expected output will be as below;
So here in first row we have a value in D column so in consolidate_elem it is filled as D.
Similarly in index -4 it is as A;C;D as they have values.
CodePudding user response:
Use DataFrame.dot
for matrix multiplication by mask created by compared not missing values by DataFrame.notna
with omit first column:
df = pd.DataFrame(df_d)
df['consolidate_elem'] = df.iloc[:, 1:].notna().dot(df.columns[1:] ';').str.strip(';')
print (df)
consolidate_elem A B C D
0 D NaN NaN NaN 11100
1 D NaN NaN NaN 11721
2 A 6/9/1972 NaN NaN NaN
3 A;B;D 9/4/1943 Yes NaN 55626
4 A;C;D 1/29/1944 NaN Yes 5,562,655,626.00
5 A;C 7/31/1965 NaN Yes NaN
6 A 4/5/1979 NaN NaN NaN
7 C;D NaN NaN Yes 33050
8 A;C;D 3/17/2000 NaN Yes 33050
9 A;C;D 3/18/2000 NaN Yes 33050
10 A;C 3/17/2000 NaN Yes NaN
Or with droping column by name:
df['consolidate_elem'] = (df.drop('consolidate_elem', axis=1).notna()
.dot(df.columns.drop('consolidate_elem') ';')
.str.strip(';'))