How to reshape dataframe by creating a grouping multiheader from specific column?-CodePudding

I like to reshape a dataframe thats first column should be used to group the other columns by an additional header row.

Initial dataframe

df = pd.DataFrame(
    {
        'col1':['A','A','A','B','B','B'],
        'col2':[1,2,3,4,5,6],
        'col3':[1,2,3,4,5,6],
        'col4':[1,2,3,4,5,6],
        'colx':[1,2,3,4,5,6]
    }
)

Trial: Using pd.pivot() I can create an example, but this do not fit my expected one, it seems to be flipped in grouping:

df.pivot(columns='col1', values=['col2','col3','col4','colx'])

     col2      col3      col4      colx     
col1    A    B    A    B    A    B    A    B
0     1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN
1     2.0  NaN  2.0  NaN  2.0  NaN  2.0  NaN
2     3.0  NaN  3.0  NaN  3.0  NaN  3.0  NaN
3     NaN  4.0  NaN  4.0  NaN  4.0  NaN  4.0
4     NaN  5.0  NaN  5.0  NaN  5.0  NaN  5.0
5     NaN  6.0  NaN  6.0  NaN  6.0  NaN  6.0

Expected output:

        A                               B
col1    col2    col3    col4    colx    col2    col3    col4    colx
0       1       1       1       1       4       4       4       4
1       2       2       2       2       5       5       5       5
2       3       3       3       3       6       6       6       6

CodePudding user response：

Create counter column by GroupBy.cumcount, then use DataFrame.pivot with swapping level of MultiIndex in columns by DataFrame.swaplevel, sorting it and last remove index and columns names by DataFrame.rename_axis:

df = (df.assign(g = df.groupby('col1').cumcount())
        .pivot(index='g', columns='col1')
        .swaplevel(0,1,axis=1)
        .sort_index(axis=1)
        .rename_axis(index=None, columns=[None, None]))

print(df)
     A                   B               
  col2 col3 col4 colx col2 col3 col4 colx
0    1    1    1    1    4    4    4    4
1    2    2    2    2    5    5    5    5
2    3    3    3    3    6    6    6    6

CodePudding user response：

As an alternative to the classical pivot, you can concat the output of groupby with a dictionary comprehension, ensuring alignment with reset_index:

out = pd.concat({k: d.drop(columns='col1').reset_index(drop=True)
                 for k,d in df.groupby('col1')}, axis=1)

output:

     A                   B               
  col2 col3 col4 colx col2 col3 col4 colx
0    1    1    1    1    4    4    4    4
1    2    2    2    2    5    5    5    5
2    3    3    3    3    6    6    6    6