pandas reset_index of certain level removes entire level of multiindex

Time:12-28

I have a DataFrame like this:

                          performance
year      month     week
2015      1         2     4.170358
                    3     3.423766
                    4    -1.835888
                    5     8.157457
          2         6    -3.276887
...                            ...
2018      7         30   -1.045241
                    31   -0.870845
          8         31    0.950555
                    32    6.757876
                    33   -2.203334

I want week to run over range(0 or 1, n), where n is the number of weeks in the current year and month.

The easy way, I thought, was to use

df.reset_index(level=2, drop=True)

But I realized later that this was a mistake. In the best case I would get

                          performance
year      month     week
2015      1         0     4.170358
                    1     3.423766
                    2    -1.835888
                    3     8.157457
          2         4    -3.276887
...                            ...
2018      7         n-4  -1.045241
                    n-3  -0.870845
          8         n-2   0.950555
                    n-1   6.757876
                    n    -2.203334

But after I did that, I got unexpected behaviour:

                        close
timestamp timestamp
2015      1          4.170358
          1          3.423766
          1         -1.835888
          1          8.157457
          2         -3.276887
...                       ...
2018      7         -1.045241
          7         -0.870845
          8          0.950555
          8          6.757876
          8         -2.203334

I lost the entire level 2 of the index! Why? I thought it would run from 0 to n within each 'cluster' (yes, that was a mistake, as I mentioned above). I solved my problem with something like this:

df.groupby(level = [0, 1]).apply(lambda x: x.reset_index(drop=True))

And got the DataFrame in the form I wanted:

                 performance
year month
2015 1     0  4.170358
           1  3.423766
           2 -1.835888
           3  8.157457
     2     0 -3.276887
...                ...
2018 7     3 -1.045241
           4 -0.870845
     8     0  0.950555
           1  6.757876
           2 -2.203334

But WHY? Why does reset_index on a certain level just drop it? That's the main question!

CodePudding user response:

reset_index with drop=True adds a default index only when you are resetting the whole index. If you're resetting just a single level of a multi-level index, it will just remove that level.
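
To make the difference concrete, here is a minimal sketch on a small made-up frame (the values and the names dropped, flat, counter and renumbered are only for illustration, they are not from the question). It shows both cases, plus one possible way to renumber the weeks per (year, month) group with groupby(...).cumcount(). That last part is an alternative I'm assuming would fit the use case, not the only way to do it.

```python
import pandas as pd

# Made-up data with the same (year, month, week) index shape as in the question.
idx = pd.MultiIndex.from_tuples(
    [(2015, 1, 2), (2015, 1, 3), (2015, 1, 4), (2015, 2, 6), (2015, 2, 7)],
    names=["year", "month", "week"],
)
df = pd.DataFrame({"performance": [4.17, 3.42, -1.84, -3.28, 0.95]}, index=idx)

# Case 1: resetting a single level with drop=True simply removes that level.
dropped = df.reset_index(level="week", drop=True)
print(list(dropped.index.names))  # ['year', 'month']

# Case 2: resetting the whole index with drop=True installs a default RangeIndex.
flat = df.reset_index(drop=True)
print(type(flat.index).__name__)  # RangeIndex

# Possible alternative: number the rows 0..k-1 inside each (year, month) group
# and append that counter as the new 'week' level.
counter = df.groupby(level=["year", "month"]).cumcount()
renumbered = (
    dropped.assign(week=counter.to_numpy())
           .set_index("week", append=True)
)
print(renumbered)
```

In the last frame the week level restarts at 0 for every (year, month) pair, which is the same shape the groupby/apply approach in the question produces, except that the new level keeps the name week.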
