I have an index that looks like this:
DatetimeIndex(['2005-01-03', '2005-01-04', '2005-01-05', '2005-01-07',
'2005-01-10', '2005-01-11', '2005-01-12', '2005-01-13',
'2005-01-14', '2005-01-17',
...
'2021-11-18', '2021-11-19', '2021-11-22', '2021-11-23',
'2021-11-24', '2021-11-25', '2021-11-26', '2021-11-29',
'2021-11-30', '2021-12-01'],
What is the best way to slice day n for each month in the index? If day n doesn't exist for a month the next day in index should be included.
CodePudding user response:
IIUC, use resample
after compute the difference between each day and the day N:
Setup a MRE:
dti = pd.date_range('2021', '2022', freq='6D')
df = pd.DataFrame({'A': np.random.random(len(dti))}, index=dti)
print(df)
# Output:
A
2021-01-01 0.771107
2021-01-07 0.513512
2021-01-13 0.613989
2021-01-19 0.842237
2021-01-25 0.056986
... ...
2021-12-03 0.002022
2021-12-09 0.943162
2021-12-15 0.665152
2021-12-21 0.350673
2021-12-27 0.109381
[61 rows x 1 columns]
Select right rows
# N = 12
>>> df.loc[df.assign(diff=abs(N - df.index.day)).resample('M')['diff'].idxmin().values]
A
2021-01-13 0.613989 # nearest day
2021-02-12 0.847639 # day N
2021-03-14 0.356432 # nearest day
2021-04-13 0.279536 # nearest day
2021-05-13 0.628562 # nearest day
2021-06-12 0.859800 # day N
2021-07-12 0.112848 # day N
2021-08-11 0.933266 # nearest day
2021-09-10 0.484271 # nearest day
2021-10-10 0.289343 # nearest day
2021-11-09 0.963057 # nearest day
2021-12-09 0.943162 # nearest day