pandas - dealing with multilayered columns

I am trying to read stock quotes from Yahoo Finance using a library. The returned data seems to have columns stacked over two levels, with "Adj Close" on top and the tickers underneath.

I want to get rid of "Adj Close" and have a simple dataframe with just the ticker columns. How do I do this?

CodePudding user response:

If you run:

data[["Adj Close"]].columns

[Out]:
MultiIndex([('Adj Close', 'AAPL'),
            ('Adj Close', 'AMZN'),
            ('Adj Close', 'GOOG')],
           )

The output shows that the columns are a MultiIndex.

You can remove a level with droplevel(), identifying the level either by its name or by its position (level index).

In this case the levels have no names, so use the position: level 0 is "Adj Close" and level 1 is the ticker.

data_adj_close = data[["Adj Close"]].droplevel(level=0, axis="columns")
data_adj_close.columns

[Out]:
Index(['AAPL', 'AMZN', 'GOOG'], dtype='object')
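If you want to try this without downloading anything, here is a minimal sketch on a synthetic frame. The prices, the extra "Close" field, and the level names "field" and "ticker" are all made up for illustration; your real data will differ. It drops the outer level once by position and once by name.

```python
import pandas as pd

# Synthetic stand-in for the downloaded quotes (values are invented).
dates = pd.to_datetime(["2022-01-03", "2022-01-04"])
columns = pd.MultiIndex.from_product([["Adj Close", "Close"], ["AAPL", "AMZN"]])
data = pd.DataFrame(
    [[180.0, 170.0, 182.0, 170.5],
     [178.0, 167.0, 179.5, 167.5]],
    index=dates,
    columns=columns,
)

# Keep only the "Adj Close" block, then drop the outer level by position.
by_position = data[["Adj Close"]].droplevel(level=0, axis="columns")

# Same thing by name: first give the levels (hypothetical) names.
named = data[["Adj Close"]].copy()
named.columns = named.columns.set_names(["field", "ticker"])
by_name = named.droplevel(level="field", axis="columns")

print(list(by_position.columns))
print(list(by_name.columns))
```

Both results end up with a flat column index holding only the tickers. Dropping by name tends to be easier to read later, but it only works once the levels actually have names.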

Now you can select by ticker:

data_adj_close["AAPL"]

[Out]:
Date
2022-01-03    180.959732
2022-01-04    178.663071
2022-01-05    173.910660
...
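
Not part of the answer above, but as a possible shortcut: selecting the top-level key with a single pair of brackets already returns the inner level as plain columns, and xs() can do the same cross-section explicitly. A small sketch on invented data (the prices and the extra "Close" field are assumptions):

```python
import pandas as pd

# Synthetic two-level frame (values are invented).
dates = pd.to_datetime(["2022-01-03", "2022-01-04"])
columns = pd.MultiIndex.from_product([["Adj Close", "Close"], ["AAPL", "AMZN"]])
data = pd.DataFrame(
    [[180.0, 170.0, 182.0, 170.5],
     [178.0, 167.0, 179.5, 167.5]],
    index=dates,
    columns=columns,
)

# Single brackets with a top-level key: the outer level is removed for you.
direct = data["Adj Close"]

# Explicit cross-section on the columns axis, outer level (position 0).
via_xs = data.xs("Adj Close", axis="columns", level=0)

print(list(direct.columns))
print(direct["AAPL"])
```

Note the difference from data[["Adj Close"]] with double brackets, which keeps both levels and is why droplevel() was needed there.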