How to replace one of the levels of a MultiIndex dataframe with one of its columns

Time:06-19

I have a dataframe such as

import numpy as np
import pandas as pd

multiindex1 = pd.MultiIndex.from_product([['a'], np.arange(3, 8)])
df1 = pd.DataFrame(np.random.randn(5, 3), index=multiindex1)
multiindex2 = pd.MultiIndex.from_product([['s'], np.arange(1, 6)])
df2 = pd.DataFrame(np.random.randn(5, 3), index=multiindex2)
multiindex3 = pd.MultiIndex.from_product([['d'], np.arange(2, 7)])
df3 = pd.DataFrame(np.random.randn(5, 3), index=multiindex3)
df = pd.concat([df1, df2, df3])
df.index.names = ['contract', 'index']
df.columns = ['z', 'x', 'c']
>>> df
                       z         x         c
contract index                              
a        3      0.354879  0.206557  0.308081
         4      0.822102 -0.425685  1.973288
         5     -0.801313 -2.101411 -0.707400
         6     -0.740651 -0.564597 -0.975532
         7     -0.310679  0.515918 -1.213565
s        1     -0.175135  0.777495  0.100466
         2      2.295485  0.381226 -0.242292
         3     -0.753414  1.172924  0.679314
         4     -0.029526 -0.020714  1.546317
         5      0.250066 -1.673020 -0.773842
d        2     -0.602578 -0.761066 -1.117238
         3     -0.935758  0.448322 -2.135439
         4      0.808704 -0.604837 -0.319351
         5      0.321139  0.584896 -0.055951
         6      0.041849 -1.660013 -2.157992

Now I want to replace the index level named index with the column c. That is, I want the result to be

                           z         x
contract c                            
a         0.308081  0.354879  0.206557
          1.973288  0.822102 -0.425685
         -0.707400 -0.801313 -2.101411
         -0.975532 -0.740651 -0.564597
         -1.213565 -0.310679  0.515918
s         0.100466 -0.175135  0.777495
         -0.242292  2.295485  0.381226
          0.679314 -0.753414  1.172924
          1.546317 -0.029526 -0.020714
         -0.773842  0.250066 -1.673020
d        -1.117238 -0.602578 -0.761066
         -2.135439 -0.935758  0.448322
         -0.319351  0.808704 -0.604837
         -0.055951  0.321139  0.584896
         -2.157992  0.041849 -1.660013

I implemented it one way:

df.reset_index().set_index(['contract', 'c']).drop(['index'], axis=1)

But this seems to involve redundant steps, because I manipulate the index three times. Is there a more elegant way to achieve this?

CodePudding user response:

Try this:

# convert column "c" into an index and remove "index" from index
df.set_index('c', append=True).droplevel('index')

Explanation:

Pandas' set_index method has an append argument that controls whether the given columns are appended to the existing index or replace it; setting it to True appends column "c" as a new index level. The droplevel method then removes the "index" level (it can also remove a column level with axis=1, but it removes index levels by default).
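
As a quick check, here is a minimal sketch that compares the one-liner above with the approach from the question. The small frame and its values are made up for illustration (they are not the random data from the question), and the names baseline and result are hypothetical; only the index layout and column names follow the question.

```python
import pandas as pd

# Small deterministic frame (illustrative values) with the same index
# layout as in the question: levels "contract" and "index".
index = pd.MultiIndex.from_tuples(
    [('a', 3), ('a', 4), ('s', 1), ('s', 2)],
    names=['contract', 'index'],
)
df = pd.DataFrame(
    {
        'z': [0.1, 0.2, 0.3, 0.4],
        'x': [1.0, 2.0, 3.0, 4.0],
        'c': [10.0, 20.0, 30.0, 40.0],
    },
    index=index,
)

# Approach from the question: reset the index, rebuild it, drop the leftover column.
baseline = df.reset_index().set_index(['contract', 'c']).drop(['index'], axis=1)

# Approach from the answer: append "c" as a level, then drop the "index" level.
result = df.set_index('c', append=True).droplevel('index')

print(list(result.index.names))   # names of the remaining index levels
print(list(result.columns))       # remaining data columns
print(result.equals(baseline))    # whether both approaches agree
```

Both approaches should give a frame indexed by contract and c with columns z and x; the second one simply avoids the round trip through reset_index.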
