I have the following data.
value
date
2010-10-01 100
2010-10-02 200
2010-10-03 300
2012-10-02 100
2012-10-03 200
2012-10-04 300
2022-10-03 500
2022-10-04 600
I would like to keep the second date present in each month of each year (the second row per month, not necessarily calendar day 2), so I should end up with:
date value
2010-10-02 200
2012-10-03 200
2022-10-04 600
I'm having a difficult time figuring out the most efficient way to do this without iteration/loops or lambdas. I thought the following would work, but it does not.
import pandas as pd
from pandas.tseries.offsets import BDay, BusinessDay, BMonthBegin
df = df[df.index == df.index BMonthBegin(0) BDay(1)]
Any help here would be appreciated.
Answer:
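Use DatetimeIndex.to_period to convert the index to monthly periods, group by those periods, and number the rows within each month with GroupBy.cumcount. The counter starts at 0, so comparing it with 1 selects the second row of each month, which can then be used for boolean indexing. This assumes the index is sorted by date: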
df = df[df.groupby(df.index.to_period('M')).cumcount().eq(1)]
print(df)
value
date
2010-10-02 200
2012-10-03 200
2022-10-04 600
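For reference, here is a minimal self-contained sketch of the same approach, rebuilding the sample data from the question so it can be run as-is. The names pos_in_month and second_rows are only illustrative, and the sort_index() call is an added assumption for the case where the index is not already in date order, since cumcount numbers rows in their current order.

```python
import pandas as pd

# Rebuild the sample data from the question: a DatetimeIndex named
# "date" and a single "value" column.
df = pd.DataFrame(
    {"value": [100, 200, 300, 100, 200, 300, 500, 600]},
    index=pd.DatetimeIndex(
        [
            "2010-10-01", "2010-10-02", "2010-10-03",
            "2012-10-02", "2012-10-03", "2012-10-04",
            "2022-10-03", "2022-10-04",
        ],
        name="date",
    ),
)

# Assumption: sort first, because cumcount follows the current row order.
df = df.sort_index()

# 0-based position of each row within its (year, month) period.
pos_in_month = df.groupby(df.index.to_period("M")).cumcount()

# Position 1 is the second date present in that month.
second_rows = df[pos_in_month.eq(1)]
print(second_rows)
```

A month that contains only one row has no position 1, so it simply drops out of the result. Changing eq(1) to eq(0) would pick the first available date per month instead.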