Home > database >  Unpivotting pandas dataframe with multi index columns
Unpivotting pandas dataframe with multi index columns

Time:04-05

I have a Dataframe that has a multi level column index. Please see the code below to construct the DataFrame. I have replaced all the data with random numbers to prevent sensitive information being leaked.

import pandas as pd
import numpy as np
import random

pd.options.display.max_columns=None

multi_index = [('Part Information', 'Brand'),
 ('Part Information', 'Model'),
 ('Part Information', 'Part Grouping'),
 ('Part Information', 'Part Desc'),
 ('Part Information', 'VWG Part Number(s)'),
 ('Part Information', 'VWG Retail Price'),
 ('Part Information', 'Trade Discount'),
 ('Part Information', 'VWG Trade Price'),
 ('Part Information', 'Is VWG Part Common or Unique?'),
 ('Part Information', 'VWG Volume'),
 ('Part Information', 'Competitor'),
 ('Q1 2018', 'Competitor Part Number'),
 ('Q1 2018', 'Competitor Brand'),
 ('Q1 2018', 'Competitor Retail Price'),
 ('Q1 2018', 'Competitor Trade Price'),
 ('Q2 2018', 'Competitor Part Number'),
 ('Q2 2018', 'Competitor Brand'),
 ('Q2 2018', 'Competitor Retail Price'),
 ('Q2 2018', 'Competitor Trade Price'),
 ('Q3 2018', 'Competitor Part Number'),
 ('Q3 2018', 'Competitor Brand'),
 ('Q3 2018', 'Competitor Retail Price'),
 ('Q3 2018', 'Competitor Trade Price'),
 ('Q4 2018', 'Competitor Part Number'),
 ('Q4 2018', 'Competitor Brand'),
 ('Q4 2018', 'Competitor Retail Price'),
 ('Q4 2018', 'Competitor Trade Price'),
 ('Q2 2019', 'Competitor Part Number'),
 ('Q2 2019', 'Competitor Brand'),
 ('Q2 2019', 'Competitor Retail Price'),
 ('Q2 2019', 'Competitor Trade Price'),
 ('Q3 2019', 'Competitor Part Number'),
 ('Q3 2019', 'Competitor Brand'),
 ('Q3 2019', 'Competitor Retail Price'),
 ('Q3 2019', 'Competitor Trade Price'),
 ('Q4 2019', 'Competitor Part Number'),
 ('Q4 2019', 'Competitor Brand'),
 ('Q4 2019', 'Competitor Retail Price'),
 ('Q4 2019', 'Competitor Trade Price')]


pd.MultiIndex.from_tuples(multi_index)

    
df = pd.DataFrame(np.random.randn(10, 39))
df.columns = pd.MultiIndex.from_tuples(multi_index)

The output dataframe is shown in the screenshots below (apologies for the multiple screenshots).

enter image description here

enter image description here

As you can see there are are repeated columns that I would like to unpivot. Furthermore, I would like to add in a 'QUARTER' column instead of having a multi index column structure. So I would like the dataframe to be manipulated as follows:

cols = ['Brand', 'Model', 'Part Grouping', 'Part Desc', 'VWG Part Number(s)',
       'VWG Retail Price', 'Trade Discount', 'VWG Trade Price',
       'Is VWG Part Common or Unique?', 'VWG Volume', 'Competitor', 'QUARTER',
       'Competitor Part Number', 'Competitor Brand', 'Competitor Retail Price',
       'Competitor Trade Price']

df_new = pd.DataFrame(np.random.randn(10, 16), columns=cols)

df_new.loc[0:4, 'QUARTER'] = 'Q1 2018'
df_new.loc[4:8, 'QUARTER'] = 'Q2 2018'
df_new.loc[9, 'QUARTER'] = '...'

enter image description here

enter image description here

How to I unpivot the column categories in Level 0 of the column hierarchy??? Do I use pd.melt() or pd.stack()/unstack()

I would greatly appreciate any help and if you need more information let me know.

CodePudding user response:

IIUC, you can split the DataFrame in two, unstack the right part and join:

(df['Part Information']
 .join(df.drop(columns='Part Information', level=0)
         .stack(0)
         .rename_axis((None, 'Quarter'))
         .reset_index(1))
)

output:

      Brand     Model  Part Grouping  Part Desc  VWG Part Number(s)  VWG Retail Price  Trade Discount  VWG Trade Price  Is VWG Part Common or Unique?  VWG Volume  Competitor  Quarter  Competitor Brand  Competitor Part Number  Competitor Retail Price  Competitor Trade Price
0  1.163696  0.789552      -1.673217  -0.256159            0.299669         -1.918318        1.741297        -0.005605                       1.085802   -0.775250   -0.800543  Q1 2018          0.668761               -0.266060                -1.018759               -0.755990
0  1.163696  0.789552      -1.673217  -0.256159            0.299669         -1.918318        1.741297        -0.005605                       1.085802   -0.775250   -0.800543  Q2 2018          1.386664               -1.832704                 1.325866               -0.123179
0  1.163696  0.789552      -1.673217  -0.256159            0.299669         -1.918318        1.741297        -0.005605                       1.085802   -0.775250   -0.800543  Q2 2019         -0.612474               -0.250223                -1.299746               -0.870354
0  1.163696  0.789552      -1.673217  -0.256159            0.299669         -1.918318        1.741297        -0.005605                       1.085802   -0.775250   -0.800543  Q3 2018         -1.553103                1.462980                 1.578326                0.417629
...
  • Related