I have a dataframe like this:
A | B | C |
---|---|---|
One | 12 | 13 |
One | 10 | 9 |
One | 3 | 6 |
Two | 11 | 7 |
Two | 14 | 12 |
Three | 12 | 8 |
Three | 1 | 9 |
Three | 3 | 4 |
I need to transform the dataframe into this:
A | B | C | B2 | C2 | B3 | C3 |
---|---|---|---|---|---|---|
One | 12 | 13 | 10 | 9 | 3 | 6 |
Two | 11 | 7 | 14 | 12 | | |
Three | 12 | 8 | 1 | 9 | 3 | 4 |
I would be very grateful if you could help me.
Thank you
CodePudding user response:
Use groupby with cumcount:
df.set_index(['A', df.groupby('A').cumcount()]).unstack()
          B                 C
          0     1    2      0     1    2
A
One    12.0  10.0  3.0   13.0   9.0  6.0
Three  12.0   1.0  3.0    8.0   9.0  4.0
Two    11.0  14.0  NaN    7.0  12.0  NaN
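To get closer to the desired output, flatten the column labels. Note that this yields the suffixes B, B1, B2 rather than B, B2, B3, and keeps all B columns before the C columns: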
d0 = df.set_index(['A', df.groupby('A').cumcount()]).unstack()
d0.columns = [f'{tup[0]}{"" if tup[1] == 0 else tup[1]}' for tup in d0.columns]
d0.reset_index()
       A     B    B1   B2     C    C1   C2
0    One  12.0  10.0  3.0  13.0   9.0  6.0
1  Three  12.0   1.0  3.0   8.0   9.0  4.0
2    Two  11.0  14.0  NaN   7.0  12.0  NaN
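If the exact layout from the question is needed (columns interleaved as B, C, B2, C2, B3, C3, groups in their original order, and integers instead of floats), the same groupby/cumcount idea can be extended. This is only a sketch: the names pos, ordered, wide and result are mine, and it assumes that A is the only key column and that the nullable Int64 dtype is acceptable for the missing cells.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["One", "One", "One", "Two", "Two", "Three", "Three", "Three"],
    "B": [12, 10, 3, 11, 14, 12, 1, 3],
    "C": [13, 9, 6, 7, 12, 8, 9, 4],
})

# Position of each row within its group: 0, 1, 2, ...
pos = df.groupby("A").cumcount()

# One row per group, one (column, position) pair per original cell.
wide = df.set_index(["A", pos]).unstack()

# Interleave the columns by position: (B, 0), (C, 0), (B, 1), (C, 1), ...
value_cols = [c for c in df.columns if c != "A"]
n_positions = int(pos.max()) + 1
ordered = [(c, i) for i in range(n_positions) for c in value_cols]
wide = wide.reindex(columns=pd.MultiIndex.from_tuples(ordered))

# Flatten the labels: the first occurrence keeps its name, later ones get 2, 3, ...
wide.columns = [c if i == 0 else f"{c}{i + 1}" for c, i in wide.columns]

# Restore the order in which the groups first appear (unstack sorts the index).
wide = wide.reindex(pd.Index(df["A"].unique(), name="A"))

# Nullable integers keep 12 as 12 instead of 12.0; missing cells become <NA>.
result = wide.astype("Int64").reset_index()
print(result)
```

The explicit reindex over ordered is used instead of sorting the column levels so that the original column order (B before C) is kept even when the column names are not alphabetical.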
CodePudding user response:
Try with stack and pivot:
stacked = df.set_index("A").stack().reset_index()
stacked["Column"] = stacked["level_1"].add(stacked.groupby(["A","level_1"]).cumcount().add(1).astype(str))
# pivot takes keyword-only arguments in pandas 2.0 and later
output = stacked.pivot(index="A", columns="Column", values=0).rename_axis(None, axis=1)
>>> output
         B1    B2   B3    C1    C2   C3
A
One    12.0  10.0  3.0  13.0   9.0  6.0
Three  12.0   1.0  3.0   8.0   9.0  4.0
Two    11.0  14.0  NaN   7.0  12.0  NaN
Input df:
import pandas as pd

df = pd.DataFrame({'A': ['One', 'One', 'One', 'Two', 'Two', 'Three', 'Three', 'Three'],
                   'B': [12, 10, 3, 11, 14, 12, 1, 3],
                   'C': [13, 9, 6, 7, 12, 8, 9, 4]})
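As a possible shortcut, the stack step can be skipped by passing a list of value columns to pivot directly. This is a sketch rather than a drop-in replacement: the helper column n and the name wide are my own, and it should give the same B1 ... C3 layout as the stack/pivot version.

```python
import pandas as pd

df = pd.DataFrame({'A': ['One', 'One', 'One', 'Two', 'Two', 'Three', 'Three', 'Three'],
                   'B': [12, 10, 3, 11, 14, 12, 1, 3],
                   'C': [13, 9, 6, 7, 12, 8, 9, 4]})

# Number the rows within each group starting at 1 (hypothetical helper column "n").
numbered = df.assign(n=df.groupby("A").cumcount().add(1))

# Pivot both value columns at once; the columns become (name, n) pairs.
wide = numbered.pivot(index="A", columns="n", values=["B", "C"])

# Flatten the (name, n) pairs into B1, B2, B3, C1, C2, C3.
wide.columns = [f"{name}{n}" for name, n in wide.columns]
print(wide)
```

Groups with fewer rows (here Two) get NaN in the unused positions, so the values come back as floats, as in the output above.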