Home > front end >  How to combine data of multiple rows in one row
How to combine data of multiple rows in one row

Time:05-13

I have a dataset with names of organisations and codes. Some organisations have multiple codes, some have only one code. I want to make a set that shows the organisation in one column, and all the codes of that organisation in another column.

This is how the dataset is right now:

enter image description here

And this is how it should be:

enter image description here

Does anyone know what script in Python I could use for this?

CodePudding user response:

df.groupby("organisation").code.apply(list)

Note that it produce a pd.Series. If you want to convert it in a pd.DataFrame use the .to_frame() method.

CodePudding user response:

You can try

out = (df.astype({'code': str})
       .groupby('organisation', as_index=False)['code']
       .apply(', '.join))
print(out)

  organisation           code
0            A  100, 101, 102
1            B            103
2            C       104, 105
3            D            106
  • Related