Home > Software design >  Make dataframe header as rows and row as header
Make dataframe header as rows and row as header

Time:09-29

I have a dataframe where I want to convert the columns headers into aggregated rows and a column into the header:

Athlete  Variable    28.0   28.0  28.0  29.0   29.0   29.0

John     Minutes     23     32    43    21     43     35
John     Distance    23.2   32.7  54.2  24.2   47.8   32.1
John     Sprints     5      3     8     3      9      2
Bob      Minutes     25     23    43    31     38     23
Bob      Distance    32.4   35.6  43.7  21.8   26.8   12.5
Bob      Sprints     8      4     8     6      9      2

Desired dataframe:

Athlete   week    Minutes   Distance   Sprints

John      28      98        110.1      16
John      29      99        104.1      14
Bob       28      91        111.7      20
Bob       29      92        61.1       17

CodePudding user response:

You’re looking for pivot_table, but you’ll need to melt the dataframe to a mono-dimensional format first:

>>> data = df.melt(id_vars=['Athlete', 'Variable'], var_name='week')
>>> data
   Athlete  Variable  week  value
0     John   Minutes  28.0   23.0
1     John  Distance  28.0   23.2
2     John   Sprints  28.0    5.0
3      Bob   Minutes  28.0   25.0
4      Bob  Distance  28.0   32.4
5      Bob   Sprints  28.0    8.0
6     John   Minutes  28.0   32.0
7     John  Distance  28.0   32.7
8     John   Sprints  28.0    3.0
9      Bob   Minutes  28.0   23.0
10     Bob  Distance  28.0   35.6
11     Bob   Sprints  28.0    4.0
12    John   Minutes  28.0   43.0
13    John  Distance  28.0   54.2
14    John   Sprints  28.0    8.0
15     Bob   Minutes  28.0   43.0
16     Bob  Distance  28.0   43.7
17     Bob   Sprints  28.0    8.0
18    John   Minutes  29.0   21.0
19    John  Distance  29.0   24.2
20    John   Sprints  29.0    3.0
21     Bob   Minutes  29.0   31.0
22     Bob  Distance  29.0   21.8
23     Bob   Sprints  29.0    6.0
24    John   Minutes  29.0   43.0
25    John  Distance  29.0   47.8
26    John   Sprints  29.0    9.0
27     Bob   Minutes  29.0   38.0
28     Bob  Distance  29.0   26.8
29     Bob   Sprints  29.0    9.0
30    John   Minutes  29.0   35.0
31    John  Distance  29.0   32.1
32    John   Sprints  29.0    2.0
33     Bob   Minutes  29.0   23.0
34     Bob  Distance  29.0   12.5
35     Bob   Sprints  29.0    2.0

So using pivot_table, you have to specify the aggregation function, here we use sum, the default being mean:

>>> data.pivot_table(index=['Athlete', 'week'], columns='Variable', values='value', aggfunc='sum')
Variable      Distance  Minutes  Sprints
Athlete week                            
Bob     28.0     111.7     91.0     20.0
        29.0      61.1     92.0     17.0
John    28.0     110.1     98.0     16.0
        29.0     104.1     99.0     14.0

CodePudding user response:

Try with groupby first then do pivot and stack

out = df.groupby(level=0,axis=1).sum().pivot('Athlete','Variable').\
          rename_axis(['week',None],axis=1).stack(level=0).reset_index()
Out[466]: 
  Athlete  week  Distance  Minutes  Sprints
0     Bob  28.0     111.7     91.0     20.0
1     Bob  29.0      61.1     92.0     17.0
2    John  28.0     110.1     98.0     16.0
3    John  29.0     104.1     99.0     14.0
  • Related