How to access index value(s), preferably by name, in MultiIndex DataFrame?

Time:08-03

Let's say we have a MultiIndex DataFrame, and said DataFrame looks something like this:

                         0
source date  row          
Alpha  01-01 0    0.183436
             1   -0.210341
Beta   02-01 0   -0.950784
             1    0.259809

How would I go about getting the unique dates as a list in the shown order above? i.e. ['01-01', '02-01']

I know I can access an individual date with: df.loc['Alpha'].index[0][0]. This does not feel pythonic. I was hoping I could do something like df.loc['Alpha'].index['date'], but this yields IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices. This makes me wonder: why even bother naming indices in the first place? Am I using the MultiIndex DataFrame incorrectly?

Source code in case someone wants to try it out:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(4), index=pd.MultiIndex.from_tuples([('Alpha', '01-01', 0), ('Alpha', '01-01', 1), ('Beta', '02-01', 0), ('Beta', '02-01', 1)], names=['source', 'date', 'row']))

CodePudding user response:

Use .index.get_level_values(), which accepts either the level name or its integer position:

df.index.get_level_values("date").unique()
#or   
df.index.get_level_values(1).unique()

Index(['01-01', '02-01'], dtype='object', name='date')
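
Since the question asks for a plain list rather than an Index, the result can be converted with .tolist(). Below is a minimal sketch that rebuilds the frame from the question; the variable names index, dates and dates_alt are my own, and the random values will differ from the ones printed in the question.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (values are random).
index = pd.MultiIndex.from_tuples(
    [("Alpha", "01-01", 0), ("Alpha", "01-01", 1),
     ("Beta", "02-01", 0), ("Beta", "02-01", 1)],
    names=["source", "date", "row"],
)
df = pd.DataFrame(np.random.randn(4), index=index)

# Select the level by name, drop duplicates (order of first
# appearance is preserved), then convert the Index to a list.
dates = df.index.get_level_values("date").unique().tolist()
print(dates)  # ['01-01', '02-01']

# Index.unique also takes a level argument on a MultiIndex,
# which should give the same result in one call.
dates_alt = df.index.unique(level="date").tolist()
```

Naming the levels is what lets "date" be used here instead of the position 1, so the code keeps working if the level order changes.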