Pandas - Unstack/pivot a dataframe with pandas-CodePudding

I have a dataframe that looks like this:

Column A	Column B	Category
1	7	A
2	8	A
3	9	B
4	10	B
5	11	C
6	12	C

I would like to write code to produce the following dataframe:

Category A		Category B		Category C
Column A	Column B	Column A	Column B	Column A	Column B
1	7	3	9	5	11
2	8	4	10	6	12

I've tried pd.pivot_table, but am not able to figure it out. Can someone help me with this please? Thanks!

CodePudding user response：

You can create a dummy index to use pivot table with:

out = df.pivot_table(
    columns="Category",
    index=df.groupby("Category").cumcount()
)

which has output:

         Column A       Column B
Category        A  B  C        A   B   C
0               1  3  5        7   9  11
1               2  4  6        8  10  12

I don't know if there's any simple way to rearrange the columns to be in your format within pivot_table itself. Here is a way by doing some post processing:

final = out.swaplevel(axis=1).sort_index(axis=1, level=0)

final:

Category        A                 B                 C
         Column A Column B Column A Column B Column A Column B
0               1        7        3        9        5       11
1               2        8        4       10        6       12

CodePudding user response：

The issue is that you cannot identify each row uniquely to be able to apply pivot. To this end, create a "within-group" index as follows.

from io import StringIO
import pandas as pd

# setup sample data
data = StringIO("""
Column A;Column B;Category
1;7;A
2;8;A
3;9;B
4;10;B
5;11;C
6;12;C
"""
)
df = pd.read_csv(data, sep=";")

# assign a within-group index
df['id'] = df.groupby('Category').cumcount()

# now apply pivot
df = df.pivot(index='id', columns='Category', values=['Column A', 'Column B'])

Now, you can apply swaplevel and sort_index to match the desired result

df.swaplevel(axis=1).sort_index(axis=1)