I have a DataFrame like this:
                 performance
year month week
2015 1     2        4.170358
           3        3.423766
           4       -1.835888
           5        8.157457
     2     6       -3.276887
...                      ...
2018 7     30      -1.045241
           31      -0.870845
     8     31       0.950555
           32       6.757876
           33      -2.203334
I want week to run over range(0 or 1, n), where n is the number of weeks in the current year and month.
The easy way, I thought, was to use
df.reset_index(level=2, drop=True)
But I realized later that this was a mistake; in the best case I would get
                 performance
year month week
2015 1     0        4.170358
           1        3.423766
           2       -1.835888
           3        8.157457
     2     4       -3.276887
...                      ...
2018 7     n-4     -1.045241
           n-3     -0.870845
     8     n-2      0.950555
           n-1      6.757876
           n       -2.203334
But when I ran it, I got unexpected behaviour:
                        close
timestamp timestamp
2015      1          4.170358
          1          3.423766
          1         -1.835888
          1          8.157457
          2         -3.276887
...                       ...
2018      7         -1.045241
          7         -0.870845
          8          0.950555
          8          6.757876
          8         -2.203334
I lost the entire 2nd level of the index! Why? I thought it would be renumbered from 0 to n within each 'cluster' (yes, that was a mistake, as I mentioned above)... I solved my problem with something like this:
df.groupby(level = [0, 1]).apply(lambda x: x.reset_index(drop=True))
And I got the DataFrame in the form I wanted:
              performance
year month
2015 1     0     4.170358
           1     3.423766
           2    -1.835888
           3     8.157457
     2     0    -3.276887
...                   ...
2018 7     3    -1.045241
           4    -0.870845
     8     0     0.950555
           1     6.757876
           2    -2.203334
But WHY? Why does reset_index on a specific level just drop it? That's the main question!
CodePudding user response:
reset_index with drop=True adds a default index only when you are resetting the whole index. If you reset just a single level of a multi-level index, that level is simply removed and the remaining levels are kept as they are.
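To make the difference concrete, here is a minimal sketch on a small made-up frame (the values and the five rows are illustrative, not the asker's data). It shows the single-level reset, the full reset, and one possible way to renumber weeks within each (year, month) group using groupby(...).cumcount(), which is an alternative to the apply approach from the question rather than the only way to do it.

```python
import pandas as pd

# Hypothetical sample with the same three index levels as in the question.
idx = pd.MultiIndex.from_tuples(
    [(2015, 1, 2), (2015, 1, 3), (2015, 1, 4), (2015, 2, 6), (2015, 2, 7)],
    names=["year", "month", "week"],
)
df = pd.DataFrame({"performance": [4.17, 3.42, -1.84, -3.28, 0.95]}, index=idx)

# Resetting a single level with drop=True only removes that level.
dropped = df.reset_index(level="week", drop=True)
print(list(dropped.index.names))  # ['year', 'month']

# Resetting the whole index with drop=True yields a default RangeIndex.
full = df.reset_index(drop=True)
print(type(full.index).__name__)  # RangeIndex

# Renumber weeks from 0 within each (year, month) group.
week_in_month = df.groupby(level=["year", "month"]).cumcount()
renumbered = (
    df.droplevel("week")
      .assign(week=week_in_month.to_numpy())
      .set_index("week", append=True)
)
print(renumbered)
```

Unlike the apply version, this keeps "week" as a named third level; add 1 to week_in_month if the numbering should start at 1.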