how to combine multiple columns and multiple rows based on group by of another column-CodePudding

I am trying to combine multiple columns and rows into single column based on group by field of ID column. The input is

|Id |   Sample_id | Sample_name |   Sample_number|
|:--|:------------|:-----------:|---------------:|             
|1  |  123        | Abcdef|ghij |  1234567       |
|1  |   345       | Vbnhj|tt|t  |   45678        |
|1  |   456       | Ffff|yyy|yy |   789000       |

Expected output : columns,rows belonging to same Id should be combined and form a list like below.

Id  Sample_details              
    123,Abcdef|ghij,1234567
 1  345,Vbnhj|tt|t, 45678       
    456 ,Ffff|yyy|yy,789000

 2   536 ,Ftff|uyy|iy,79000
     453, hnhj|tdd|rr, 67678

I tried below which is not working

df.groupby('Id')['Sample_id']['Sample_name']['Sample_number'].apply(','.join).reset_index()

CodePudding user response：

Try:

df['Sample_details'] = df.filter(like='Sample_').astype(str).apply(','.join, axis=1)

out = df.groupby('Id')['Sample_details'].apply('\n'.join).reset_index()

Output:

>>> out
   Id                                     Sample_details
0   1  123,Abcdef|ghij,1234567\n345,Vbnhj|tt|t,45678\...

Note: Pandas does not interpret escape sequence.

CodePudding user response：

df['Sample_details'] = df.filter(like='Sample_').astype(str).apply(','.join, axis=1)

below solution helped me merging rows after combining columns by df.filter

df.groupby(['Id']).agg({'Sample_details': lambda x: ",".join(x)}).reset_index()