Home > Net >  How to pivot a column into columns while the rest of the column become nested column under the pivot
How to pivot a column into columns while the rest of the column become nested column under the pivot

Time:01-21

I have a data look like this:

    year     energy_products  consumption_ktoe    value_ktoe
0   2009       Coal and Peat               3.0      3.300000
1   2009           Crude Oil               0.0  49079.900000
2   2009         Electricity            3338.1   3594.203691
3   2009         Natural Gas             867.8   6656.700000
4   2009              Others               0.0      0.000000
..  ...            .......           .......       .........

I want to pivot energy_product into columns and consumption_ktoe and value_ktoe as sub column under every pivoted columns.

So my desire output is something like this:

 energy_products  Coal and Peat                 Crude Oil                     \
    year          consumption_ktoe  value_ktoe  consumption_ktoe  value_ktoe
0   2009                       3.0    3.300000                 3           4


energy_products   Electricity                   Natural Gas                   \
    year          consumption_ktoe  value_ktoe  consumption_ktoe  value_ktoe  
0   2009                       3.0    3.300000                 3           4

energy_products   Others
    year          consumption_ktoe  value_ktoe  
0   2009                       3.0    3.300000 

After I pivot it, I get this:

finalConImportMerge = finalConImportMerge.pivot(index='year', columns=['energy_products'])
                consumption_ktoe                                           \
energy_products    Coal and Peat Crude Oil Electricity Natural Gas Others   
year                                                                        
2009                         3.0       0.0      3338.1       867.8    0.0   
2010                         5.9       0.0      3633.1      1128.3    0.0   
2011                         5.7       0.0      3697.9      1184.0    0.0   
2012                        23.0       0.0      3800.4      1126.2    0.0   
2013                       129.0       0.0      3864.8      1311.6    0.0   
2014                       162.2       0.0      3989.9      1355.1    0.0   
2015                       162.9       0.0      4085.5      1340.5    0.0   
2016                       166.8       0.0      4180.9      1360.3    0.0   
2017                       600.5       0.0      4268.1      1438.4    0.0   
2018                       182.5       0.0      4338.8      1479.4    0.0   
2019                       180.9       0.0      4448.0      1482.4    0.0   
2020                       182.7       0.0      4366.2      1638.2    0.0   

                                      value_ktoe                         \
energy_products Petroleum Products Coal and Peat Crude Oil  Electricity   
year                                                                      
2009                        6474.8           3.3   49079.9  3594.203691   
2010                        7790.2           6.0   46074.3  3900.815819   
2011                        7614.0           6.1   47087.8  3955.235409   
2012                        7946.7          25.0   52191.4  4035.785689   
2013                        8475.0         266.4   48759.6  4124.117566   
2014                        8968.3         396.2   49155.7  4239.865600   
2015                        9993.3         406.1   50008.5  4322.578390   
2016                        9351.2         426.6   53410.0  4435.648008   
2017                        9149.6         898.6   58554.5  4490.609300   
2018                        9073.5         477.9   57180.3  4548.992776   
2019                       10125.2         458.0   54399.7  4655.398594   
2020                        8713.4         433.2   47047.3  4563.333718   

I tried swapping their level

finalConImportMerge = finalConImportMerge.swaplevel(axis=1)

but end up something like this:

energy_products    Coal and Peat        Crude Oil ...... Coal and Peat   Crude Oil .......
year            consumption_ktoe  consumption_ktoe  ......   value_ktoe  value_ktoe .....
2009                           3                 3                    3           3

Then I thought of combining the top level column, but if I use groupby, I have to sum it, and the consumption col and value col will be summed, which is not I want.

finalConImportMerge = finalConImportMerge.groupby(level=[0], axis=1)

I end up with this:

energy_products Coal and Peat   Crude Oil   Electricity Natural Gas Others  \
year                        
2009                      6.3     49079.9   6932.303691      7524.5    0.0  
....                     ....         ....              ....            ...         ..
energy_products     Petroleum Products
year                        
2009                           96774.9 
....                             ....

Is there any way to achieve my desire result? Or group the column without summing them?

CodePudding user response:

You were almost there. It seems a bit hacky, but I'm not aware of another way to do it besides this. (I use the example from pandas pivot documentation because you didn't provide code to recreate your sample DataFrame.)

import pandas as pd

df = pd.DataFrame({'foo': ['one', 'one', 'one', 'two', 'two',
                           'two'],
                   'bar': ['A', 'B', 'C', 'A', 'B', 'C'],
                   'baz': [1, 2, 3, 4, 5, 6],
                   'zoo': ['x', 'y', 'z', 'q', 'w', 't']})

pivoted_df = df.pivot(index='foo', columns='bar', values=['baz', 'zoo'])
goal_df = pivoted_df.swaplevel(axis=1)[sorted(pivoted_df.swaplevel(axis=1))]

goal_df looks like this

bar A       B       C
    baz zoo baz zoo baz zoo
foo                     
one 1   x   2   y   3   z
two 4   q   5   w   6   t

CodePudding user response:

How about pivot,reset, unstack, swap,reset, drop ,set_index , Transpose and reset

  finalConImportMerge.pivot(index=['year'], columns=['energy_products']).reset_index().unstack().swaplevel(1,0).reset_index().drop('level_2',axis=1).set_index('energy_products').T.reset_index(drop=True)
Out[47]: 
energy_products             Coal_and_Peat         Crude_Oil       Electricity       Natural_Gas            Others Coal_and_Peat   Crude_Oil  Electricity Natural_Gas      Others
0                  year  consumption_ktoe  consumption_ktoe  consumption_ktoe  consumption_ktoe  consumption_ktoe    value_ktoe  value_ktoe   value_ktoe  value_ktoe  value_ktoe
1                2009.0               3.0               0.0            3338.1             867.8               0.0           3.3     49079.9  3594.203691      6656.7         0.0
  • Related