Concatenating multiple DataFrame column values into one string per row when empty values may occur (pandas)

The task is simple, but I'm a little confused. Let's say I have a dataset with 3 columns. Sometimes one of them contains an empty value. I need to combine the values of each row into a single string in a separate column.

Sample source dataframe:

import numpy as np
import pandas as pd
df = pd.DataFrame({'col1': [np.nan, 'b_1', 'c_1'], 'col2': ['a_2', np.nan, 'c_2'],
                   'col3': ['a_3', 'b_3', np.nan]})

col1    col2    col3
NaN     a_2     a_3
b_1     NaN     b_3
c_1     c_2     NaN

After merging, I need the following result (that is, the non-empty values are combined with a comma between them):

col1    col2    col3    col4
NaN     a_2     a_3     a_2, a_3
b_1     NaN     b_3     b_1, b_3
c_1     c_2     NaN     c_1, c_2

I tried this earlier, but wherever a value was empty, only a blank was left in its place and the separators remained. How can this command be improved?

df['col4'] = df['col1'].map(str) + ', ' + df['col2'].map(str) + ', ' + df['col3'].map(str)

col1    col2    col3    col4
NaN     a_2     a_3     , a_2, a_3
b_1     NaN     b_3     b_1, , b_3
c_1     c_2     NaN     c_1, c_2,

CodePudding user response:

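Try this:

# NaN is the only value that is not equal to itself, so `x == x` keeps only the non-missing values.
df['col4'] = df.apply(lambda d: ', '.join([x for x in d if x == x]), axis=1)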

Sample Input:

  col1 col2 col3
0  NaN  a_2  a_3
1  b_1  NaN  b_3
2  c_1  c_2  NaN

Sample Output:

  col1 col2 col3      col4
0  NaN  a_2  a_3  a_2, a_3
1  b_1  NaN  b_3  b_1, b_3
2  c_1  c_2  NaN  c_1, c_2
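
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It swaps the `x == x` trick for `pd.notna(x)`, which is my substitution rather than part of the original answer, and it assumes every non-missing value is already a string.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': [np.nan, 'b_1', 'c_1'],
    'col2': ['a_2', np.nan, 'c_2'],
    'col3': ['a_3', 'b_3', np.nan],
})

# pd.notna(x) is False for NaN/None, so only real values reach join().
df['col4'] = df.apply(
    lambda row: ', '.join(x for x in row if pd.notna(x)),
    axis=1,
)
print(df)
```

Because missing values are filtered out before joining, no stray separators are left behind.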

What if only specific columns are required?

df['col4'] = df[['col1', 'col2', 'col3']].apply(lambda d: ', '.join([x for x in d if x == x]), axis=1)

Sample Input:

  col1 col2 col3 colz
0  NaN  a_2  a_3  z_1
1  b_1  NaN  b_3  NaN
2  c_1  c_2  NaN  z_2

Sample Output:

  col1 col2 col3 colz      col4
0  NaN  a_2  a_3  z_1  a_2, a_3
1  b_1  NaN  b_3  NaN  b_1, b_3
2  c_1  c_2  NaN  z_2  c_1, c_2

NOTE: colz values are not included in the col4 result.
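
If this is needed in several places, the column selection can be wrapped in a small helper. The sketch below is only one possible way to do it: `join_non_null` and its `sep` parameter are hypothetical names, not pandas API. It uses `Series.dropna()` to discard missing values and `astype(str)` on the assumption that some values might not be strings.

```python
import numpy as np
import pandas as pd


def join_non_null(frame, columns, sep=', '):
    """Hypothetical helper: join the non-missing values of `columns`, row by row."""
    return frame[columns].apply(
        # dropna() removes NaN/None; astype(str) guards against non-string values.
        lambda row: sep.join(row.dropna().astype(str)),
        axis=1,
    )


df = pd.DataFrame({
    'col1': [np.nan, 'b_1', 'c_1'],
    'col2': ['a_2', np.nan, 'c_2'],
    'col3': ['a_3', 'b_3', np.nan],
    'colz': ['z_1', np.nan, 'z_2'],
})

# Only col1..col3 take part; colz is left out on purpose.
df['col4'] = join_non_null(df, ['col1', 'col2', 'col3'])
print(df)
```

A row in which every selected column is missing ends up as an empty string, since joining an empty sequence yields `''`.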