Home > database >  How to create multiple columns in Pandas Dataframe?
How to create multiple columns in Pandas Dataframe?

Time:10-23

my question is simple: I have data as you can see in the "terminal" I need it to be converted to the excel sheet format as you can see in the excel sheet file by creating multi-levels in columns.

I googled for hours and reach many different things but cannot achieve my goal then, I reached "transpose", and it gave me the shape that I need but unfortunately that it did reshape from a column to a row instead where I got the wrong data ordering.

Current result:
enter image description here

Desired result:
enter image description here

so, can anyone help me with that, please?

thanks in advance

CodePudding user response:

This can be done with unstack after setting player__number as index. You have to reorder the Multiindex columns and fill missing values/delete duplicates though:

import pandas as pd

data = {"player__number": [2, 10 , 2, 10, 2, 10],
        "cel1": [0, 2, 0, 1, 0, 1],
        "cel2": [0, 5, 0, 1, 0, 2],
        "cel3": [1, 3, 0, 1, 0, 1],
}

df = pd.DataFrame(data).set_index('player__number', append=True)
df = df.unstack('player__number').reorder_levels([1, 0], axis=1).sort_index(axis=1) # unstacking, reordering and sorting columns
df = df.ffill().iloc[1::2].reset_index(drop=True) # filling values and keeping only every two rows
df.to_excel('output.xlsx')

Output:

enter image description here

CodePudding user response:

You can use pivot() function and reorder multi-column levels.

Before that, index/group data for repeated iterations/rounds:

data=[
    (2,0,0,1),
    (10,2,5,3),
    (2,0,0,0),
    (10,1,1,1),
    (2,0,0,0),
    (10,1,2,1),
]

columns = ["player_number", "cel1", "cel2", "cel3"]

df = pd.DataFrame(data=data, columns=columns)

df_nbr_plr = df[["player_number"]].groupby("player_number").agg(cnt=("player_number","count"))
df["round"] = list(itertools.chain.from_iterable(itertools.repeat(x, df_nbr_plr.shape[0]) for x in range(df_nbr_plr.iloc[0,0])))

[Out]:
   player_number  cel1  cel2  cel3  round
0              2     0     0     1      0
1             10     2     5     3      0
2              2     0     0     0      1
3             10     1     1     1      1
4              2     0     0     0      2
5             10     1     2     1      2

Now, pivot and reorder the colums levels:

df = df.pivot(index="round", columns="player_number").reorder_levels([1,0], axis=1).sort_index(axis=1)

[Out]:
player_number   2              10          
              cel1 cel2 cel3 cel1 cel2 cel3
round                                      
0                0    0    1    2    5    3
1                0    0    0    1    1    1
2                0    0    0    1    2    1
  • Related