How to combine or join unique values from different columns in DataFrame?-CodePudding

This is my Dataset and I want to combine Symptom_1, Symptom_2, Symptom_3

This is my Dataset and I want to combine Symptom_1, Symptom_2, Symptom_3. The code that I used to combine them is

df['Symptom'] = df.apply(lambda row: row.Symptom_1   row.Symptom_2   row.Symptom_3, axis=1,)
df.head()

But the problem in this is that the Symptom column does not contain a seperator ',' between values

CodePudding user response：

IIUC, you can try

df['Symptom'] = df.apply(lambda row: ','.join([row.Symptom_1, row.Symptom_2, row.Symptom_3]), axis=1,)
# or
df['Symptom'] = df.apply(lambda row: ','.join(row.filter(like='Symptom')), axis=1,)
# or
df['Symptom'] = df.filter(like='Symptom').apply(','.join, axis=1)

If you only want to keep the unique value, you can try

df['Symptom'] = df.apply(lambda row: ','.join(row.filter(like='Symptom').drop_duplicates()), axis=1,)

CodePudding user response：

just add separator inbetween

df['Symptom'] = df.apply(lambda row: row.Symptom_1   ','   row.Symptom_2    ','   row.Symptom_3, axis=1,)

CodePudding user response：

df['Symptom'] = df['Symptom_1'].str.cat(df[['Symptom_2', 'Symptom_3']], sep=', ')