Home > Software engineering >  how to combine multiple columns and multiple rows based on group by of another column
how to combine multiple columns and multiple rows based on group by of another column

Time:09-23

I am trying to combine multiple columns and rows into single column based on group by field of ID column. The input is

|Id |   Sample_id | Sample_name |   Sample_number|
|:--|:------------|:-----------:|---------------:|             
|1  |  123        | Abcdef|ghij |  1234567       |
|1  |   345       | Vbnhj|tt|t  |   45678        |
|1  |   456       | Ffff|yyy|yy |   789000       |

Expected output : columns,rows belonging to same Id should be combined and form a list like below.

Id  Sample_details              
    123,Abcdef|ghij,1234567
 1  345,Vbnhj|tt|t, 45678       
    456 ,Ffff|yyy|yy,789000

 2   536 ,Ftff|uyy|iy,79000
     453, hnhj|tdd|rr, 67678 
   

I tried below which is not working

df.groupby('Id')['Sample_id']['Sample_name']['Sample_number'].apply(','.join).reset_index()

CodePudding user response:

Try:

df['Sample_details'] = df.filter(like='Sample_').astype(str).apply(','.join, axis=1)

out = df.groupby('Id')['Sample_details'].apply('\n'.join).reset_index()

Output:

>>> out
   Id                                     Sample_details
0   1  123,Abcdef|ghij,1234567\n345,Vbnhj|tt|t,45678\...

Note: Pandas does not interpret escape sequence.

CodePudding user response:

df['Sample_details'] = df.filter(like='Sample_').astype(str).apply(','.join, axis=1)

below solution helped me merging rows after combining columns by df.filter

df.groupby(['Id']).agg({'Sample_details': lambda x: ",".join(x)}).reset_index()

  • Related