I need to combine rows that share a key (a simple example below, keyed on 'id'):
id unit amount
1 m 10
1 kg 3
2 m 4
3 number 5
3 kg 7
3 m 6
I want it converted to:
id unit amount unit amount unit amount
1 m 10 kg 3
2 m 4
3 number 5 kg 7 m 6
which stores the information in newly created columns that repeat the same names.
I understand that groupby and join could combine the rows into the existing columns, but how do I store them in newly created ones?
CodePudding user response:
You can use pivot:
df.assign(col=df.groupby('id').cumcount()).pivot(index='id', columns='col')
       unit           amount
col       0    1    2      0    1    2
id
1         m   kg  NaN   10.0  3.0  NaN
2         m  NaN  NaN    4.0  NaN  NaN
3    number   kg    m    5.0  7.0  6.0
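To make the role of cumcount concrete, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample data in the question, and the helper column name col is arbitrary. It passes index and columns by keyword, which recent pandas versions require.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id": [1, 1, 2, 3, 3, 3],
    "unit": ["m", "kg", "m", "number", "kg", "m"],
    "amount": [10, 3, 4, 5, 7, 6],
})

# cumcount numbers the rows inside each id group, starting at 0
position = df.groupby("id").cumcount()
print(position.tolist())  # [0, 1, 0, 0, 1, 2]

# The position becomes the second column level: one (unit, amount) pair per position
wide = df.assign(col=position).pivot(index="id", columns="col")
print(wide)
```

Groups with fewer rows than the largest group are padded with NaN, which is why amount is shown as float.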
If you want a layout close to the one you described (within each pair the columns are sorted alphabetically, so amount comes before unit):
(df.assign(col=df.groupby('id').cumcount())
   .pivot(index='id', columns='col').reorder_levels([1, 0], axis=1)
   .sort_index(axis=1).droplevel(0, axis=1).reset_index())
   id  amount    unit  amount unit  amount unit
0   1    10.0       m     3.0   kg     NaN  NaN
1   2     4.0       m     NaN  NaN     NaN  NaN
2   3     5.0  number     7.0   kg     6.0    m
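If the unit-before-amount order from the question matters, one option is to spell out the column order instead of sorting. This is a sketch, not the only way to do it. The names positions, order and result are made up for illustration, and the DataFrame is rebuilt from the question's sample data.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id": [1, 1, 2, 3, 3, 3],
    "unit": ["m", "kg", "m", "number", "kg", "m"],
    "amount": [10, 3, 4, 5, 7, 6],
})

wide = df.assign(col=df.groupby("id").cumcount()).pivot(index="id", columns="col")

# Positions present in the second column level (0, 1, 2 for this data)
positions = sorted(wide.columns.get_level_values(1).unique())

# Interleave explicitly: (unit, 0), (amount, 0), (unit, 1), (amount, 1), ...
order = pd.MultiIndex.from_tuples(
    [(name, pos) for pos in positions for name in ("unit", "amount")],
    names=wide.columns.names,
)

# Reorder, drop the position level, and turn id back into a regular column
result = wide.reindex(columns=order).droplevel(1, axis=1).reset_index()
print(result)
```

The result has repeated column names, as requested in the question. Note that selecting result['unit'] then returns a DataFrame with several columns rather than a single Series.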
If you like, you can use pivot_wider from pyjanitor:
import janitor
df.assign(col=df.groupby('id').cumcount()).pivot_wider('id', 'col')
   id  unit_0 unit_1 unit_2  amount_0  amount_1  amount_2
0   1       m     kg    NaN      10.0       3.0       NaN
1   2       m    NaN    NaN       4.0       NaN       NaN
2   3  number     kg      m       5.0       7.0       6.0
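If you would rather not add a dependency, similar flat names can be produced with plain pandas by joining the two column levels yourself. A minimal sketch, again rebuilding the DataFrame from the question's sample data; the names wide and flat are arbitrary.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id": [1, 1, 2, 3, 3, 3],
    "unit": ["m", "kg", "m", "number", "kg", "m"],
    "amount": [10, 3, 4, 5, 7, 6],
})

wide = df.assign(col=df.groupby("id").cumcount()).pivot(index="id", columns="col")

# Join each (name, position) pair into one label such as "unit_0"
wide.columns = [f"{name}_{pos}" for name, pos in wide.columns]

flat = wide.reset_index()
print(flat.columns.tolist())
```

Unique names like these are usually easier to work with afterwards than repeated unit and amount labels.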