Python: How to set hierarchical columns?

Time:11-09

I have the following DataFrame:

import pandas as pd

df = pd.DataFrame(index=['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04'])
df["ticker"] = ['TSLA', 'TSLA', 'IBM', 'IBM']
df["price"] = ['1000', '1200', '101', '108']
df["volume"] = ['100000', '123042', '1087878', '108732']
df["marketcap"] = ['1.2T', '1.4T', '30B', '35B']
df.index.rename('Date', inplace=True)
df:
           ticker price   volume marketcap
Date
2018-01-01   TSLA  1000   100000      1.2T
2018-01-02   TSLA  1200   123042      1.4T
2018-01-03    IBM   101  1087878       30B
2018-01-04    IBM   108   108732       35B

I would like the tickers to become column level 0, with price, volume, and marketcap as column level 1. I want my DataFrame to look like this:

df
             TSLA                     IBM
            price  volume marketcap price   volume marketcap
Date
2018-01-01   1000  100000      1.2T   NaN      NaN       NaN
2018-01-02   1200  123042      1.4T   NaN      NaN       NaN
2018-01-03    NaN     NaN       NaN   101  1087878       30B
2018-01-04    NaN     NaN       NaN   108   108732       35B

How can this be done using pandas?

CodePudding user response:

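You can append ticker to the index with set_index, move it into the columns with unstack, and then tidy up the column levels. After unstack the ticker is the inner column level, so swaplevel puts it on top and sort_index groups the columns by ticker (alphabetically, so IBM comes before TSLA):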

(df.set_index('ticker', append=True)
   .unstack('ticker')
   .swaplevel(axis=1)
   .sort_index(axis=1, level=0, sort_remaining=False)
)

output:

ticker       IBM                     TSLA                  
           price   volume marketcap price  volume marketcap
Date                                                       
2018-01-01   NaN      NaN       NaN  1000  100000      1.2T
2018-01-02   NaN      NaN       NaN  1200  123042      1.4T
2018-01-03   101  1087878       30B   NaN     NaN       NaN
2018-01-04   108   108732       35B   NaN     NaN       NaN
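
If the tickers should appear in their original order (TSLA before IBM, as in the question), one option is to replace the sort with an explicit reindex of the columns. The following is a minimal, self-contained sketch of that idea rather than part of the answer above; the names `wide`, `tickers`, and `fields` are made up for illustration, and the frame is rebuilt here only so the snippet runs on its own.

```python
import pandas as pd

# Rebuild the example frame from the question (values kept as strings, as posted).
df = pd.DataFrame(
    {
        "ticker": ["TSLA", "TSLA", "IBM", "IBM"],
        "price": ["1000", "1200", "101", "108"],
        "volume": ["100000", "123042", "1087878", "108732"],
        "marketcap": ["1.2T", "1.4T", "30B", "35B"],
    },
    index=pd.Index(
        ["2018-01-01", "2018-01-02", "2018-01-03", "2018-01-04"], name="Date"
    ),
)

wide = (
    df.set_index("ticker", append=True)  # index becomes (Date, ticker)
      .unstack("ticker")                 # columns become (field, ticker)
      .swaplevel(axis=1)                 # columns become (ticker, field)
)

# Hypothetical helper names: spell out the desired column order explicitly
# instead of relying on sort_index, which orders tickers alphabetically.
tickers = df["ticker"].unique()          # order of first appearance
fields = ["price", "volume", "marketcap"]
wide = wide.reindex(columns=pd.MultiIndex.from_product([tickers, fields]))

print(wide)
```

With the columns in this shape, `wide["TSLA"]` selects all three fields for one ticker, and `wide.loc[:, ("IBM", "price")]` selects a single field.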