I have this following dataframe:
Status Percentage Value Name Tahun
0 X 66.666667 4.0 A 2021
1 Y 33.333333 2.0 A 2021
2 Z 0.000000 0.0 A 2021
0 X 25.000000 2.0 A 2020
1 Y 62.500000 5.0 A 2020
2 Z 12.500000 1.0 A 2020
I want to transpose the dataframe and change the column header to Status
values. Ideally the output should look like
X Y Z Type Name Tahun
66.666667 33.333333 0.000000 Percentage A 2021
4.0 2.0 0.0 Value A 2021
25.000000 62.500000 12.500000 Percentage A 2020
2.0 5.0 1.0 Value A 2020
I tried this one:
df = df.set_index('Status').T
but I didnt get output as my expected. How can I change the rest of column names?
CodePudding user response:
stack
(Percentage and Value) unstack
(Status):
(df.set_index(['Name', 'Tahun', 'Status'])
.stack()
.unstack(level='Status')
.rename_axis(('Name', 'Tahun', 'Type'))
.reset_index())
Status Name Tahun Type X Y Z
0 A 2020 Percentage 25.000000 62.500000 12.5
1 A 2020 Value 2.000000 5.000000 1.0
2 A 2021 Percentage 66.666667 33.333333 0.0
3 A 2021 Value 4.000000 2.000000 0.0
CodePudding user response:
Or just use melt
and pivot
:
(df.melt(['Name', 'Tahun', 'Status'], var_name='Type')
.pivot('value', ['Name', 'Tahun', 'Type'], 'Status')
.reset_index()
.rename_axis(columns=None))
Name Tahun Type X Y Z
0 A 2020 Percentage 25.000000 62.500000 12.5
1 A 2020 Value 2.000000 5.000000 1.0
2 A 2021 Percentage 66.666667 33.333333 0.0
3 A 2021 Value 4.000000 2.000000 0.0
This code melts the dataframe so that the Percentage
and Value
columns get merged and a new column Type
get's created, then it pivots it so that the Status
column values become columns.
If there are duplicates:
(df.melt(['Name', 'Tahun', 'Status'], var_name='Type')
.pivot_table('value', ['Name', 'Tahun', 'Type'], 'Status')
.reset_index()
.rename_axis(columns=None))
Difference is that pivot_table
has an aggfunc
argument, default set to mean
, so if there are duplicate values, it will find the average of the other values, whereas pivot
doesn't have that argument.