Home > database >  Pandas - syntax of merging data from multiple rows into a single row
Pandas - syntax of merging data from multiple rows into a single row

Time:08-24

I am trying merge data from multiple rows into a single row with multiple columns. I can only manage one column to work when I want multiple.

Given below is how the data looks like :

column1 column2 column3
1       a       Y
1       b       Z
2       d       Z
2       c       Y

Expected output:

column1 column2 column3
1       a, b       Y, Z
2       d, c       Z, Y

Using df.groupby('column1')['column2'].apply(', '.join).reset_index() I've got the output:

column1 column2 
1       a, b      
2       d, c    

But unsure the syntax for adding column3.

df.groupby('column1')['column2', 'column3'].apply(', '.join).reset_index() doesn't work.

Neither does df.groupby('column1')[['column2', 'column3']].apply(', '.join).reset_index().

CodePudding user response:

You can try agg for multiple concatenations:

df.groupby('column1').agg(lambda x: ', '.join(x)).reset_index()
  • Related