Sheet 1 has a 'Date' column with 10 years' worth of dates. These dates are the trading days for the Australian stock market. I want to remove every date that is not the 15th trading day of its month (not necessarily the 15th calendar day of the month). The code below works for the first 12 months, but the output stops after that.
Code:
import pandas as pd

df = pd.read_csv(r'C:\Users\\Desktop\Sheet1.csv')
df['Date'] = pd.to_datetime(df['Date'])
df['month'] = df['Date'].dt.month
df['trading_day'] = df.groupby(['month']).cumcount() + 1
df = df[df['trading_day'] == 15]
df.drop(['month', 'trading_day'], axis=1, inplace=True)
df.to_excel("Sheet2.xlsx", index=False)
Current output:
Date NAV
2009-06-22 00:00:00 $50.7731
2009-07-21 00:00:00 $52.2194
2009-08-21 00:00:00 $55.5233
2009-09-21 00:00:00 $61.1116
2009-10-21 00:00:00 $62.6512
2009-11-20 00:00:00 $60.9736
2009-12-21 00:00:00 $60.2841
2010-01-22 00:00:00 $61.2418
2010-02-19 00:00:00 $59.8768
2010-03-19 00:00:00 $63.4521
2010-04-23 00:00:00 $63.1672
2010-05-21 00:00:00 $55.8651
CodePudding user response:
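You also need to group by year when computing the cumcount. Grouping by month alone puts every January from all ten years into one group, so the count reaches 15 only once per calendar month, which is why you get just 12 rows: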
df['trading_day'] = df.groupby([df['Date'].dt.year, 'month']).cumcount() + 1
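To see the difference without the original CSV, here is a minimal sketch on made-up data. The date range and the NAV values are assumptions for illustration only: `pd.bdate_range` generates plain weekdays and ignores Australian exchange holidays, so the dates will not match the real file.

```python
import pandas as pd

# Synthetic stand-in for Sheet1.csv: two years of weekdays (holidays ignored).
dates = pd.bdate_range("2009-06-01", "2011-05-31")
df = pd.DataFrame({"Date": dates, "NAV": range(len(dates))})

year = df["Date"].dt.year.rename("year")
month = df["Date"].dt.month.rename("month")

# Original approach: one counter per calendar month, shared across years.
by_month = df.groupby(month).cumcount() + 1

# Fixed approach: one counter per (year, month) pair.
df["trading_day"] = df.groupby([year, month]).cumcount() + 1

# Keep only the 15th trading day of every month.
result = df.loc[df["trading_day"] == 15, ["Date", "NAV"]].reset_index(drop=True)

print((by_month == 15).sum())  # rows kept when grouping by month only
print(len(result))             # rows kept when grouping by year and month
```

On this sample the month-only grouping keeps one row per calendar month, while the year-and-month grouping keeps one row for each of the 24 months. Grouping by `df['Date'].dt.to_period('M')` should give the same counter with a single key, if you prefer not to build the year and month separately.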