Home > Software design >  Transpose Pandas Dataframe Python
Transpose Pandas Dataframe Python

Time:09-21

I have this following dataframe:

  Status    Percentage  Value   Name    Tahun
0   X       66.666667    4.0     A      2021
1   Y       33.333333    2.0     A      2021
2   Z       0.000000     0.0     A      2021
0   X       25.000000    2.0     A      2020
1   Y       62.500000    5.0     A      2020
2   Z       12.500000    1.0     A      2020           

I want to transpose the dataframe and change the column header to Status values. Ideally the output should look like

X            Y           Z          Type         Name    Tahun
66.666667    33.333333   0.000000   Percentage    A       2021 
4.0          2.0         0.0        Value         A       2021
25.000000    62.500000   12.500000  Percentage    A       2020
2.0          5.0         1.0        Value         A       2020                             
               

I tried this one:

df = df.set_index('Status').T

but I didnt get output as my expected. How can I change the rest of column names?

CodePudding user response:

stack (Percentage and Value) unstack (Status):

(df.set_index(['Name', 'Tahun', 'Status'])
   .stack()
   .unstack(level='Status')
   .rename_axis(('Name', 'Tahun', 'Type'))
   .reset_index())

Status Name  Tahun        Type          X          Y     Z
0         A   2020  Percentage  25.000000  62.500000  12.5
1         A   2020       Value   2.000000   5.000000   1.0
2         A   2021  Percentage  66.666667  33.333333   0.0
3         A   2021       Value   4.000000   2.000000   0.0

CodePudding user response:

Or just use melt and pivot:

(df.melt(['Name', 'Tahun', 'Status'], var_name='Type')
   .pivot('value', ['Name', 'Tahun', 'Type'], 'Status')
   .reset_index()
   .rename_axis(columns=None))

  Name  Tahun        Type          X          Y     Z
0    A   2020  Percentage  25.000000  62.500000  12.5
1    A   2020       Value   2.000000   5.000000   1.0
2    A   2021  Percentage  66.666667  33.333333   0.0
3    A   2021       Value   4.000000   2.000000   0.0

This code melts the dataframe so that the Percentage and Value columns get merged and a new column Type get's created, then it pivots it so that the Status column values become columns.

If there are duplicates:

(df.melt(['Name', 'Tahun', 'Status'], var_name='Type')
   .pivot_table('value', ['Name', 'Tahun', 'Type'], 'Status')
   .reset_index()
   .rename_axis(columns=None))

Difference is that pivot_table has an aggfunc argument, default set to mean, so if there are duplicate values, it will find the average of the other values, whereas pivot doesn't have that argument.

  • Related