I am trying to combine multiple columns and rows into single column based on group by field of ID column. The input is
|Id | Sample_id | Sample_name | Sample_number|
|:--|:------------|:-----------:|---------------:|
|1 | 123 | Abcdef|ghij | 1234567 |
|1 | 345 | Vbnhj|tt|t | 45678 |
|1 | 456 | Ffff|yyy|yy | 789000 |
Expected output : columns,rows belonging to same Id should be combined and form a list like below.
Id Sample_details
123,Abcdef|ghij,1234567
1 345,Vbnhj|tt|t, 45678
456 ,Ffff|yyy|yy,789000
2 536 ,Ftff|uyy|iy,79000
453, hnhj|tdd|rr, 67678
I tried below which is not working
df.groupby('Id')['Sample_id']['Sample_name']['Sample_number'].apply(','.join).reset_index()
CodePudding user response:
Try:
df['Sample_details'] = df.filter(like='Sample_').astype(str).apply(','.join, axis=1)
out = df.groupby('Id')['Sample_details'].apply('\n'.join).reset_index()
Output:
>>> out
Id Sample_details
0 1 123,Abcdef|ghij,1234567\n345,Vbnhj|tt|t,45678\...
Note: Pandas does not interpret escape sequence.
CodePudding user response:
df['Sample_details'] = df.filter(like='Sample_').astype(str).apply(','.join, axis=1)
below solution helped me merging rows after combining columns by df.filter
df.groupby(['Id']).agg({'Sample_details': lambda x: ",".join(x)}).reset_index()