I have a pandas dataframe that has a DatetimeIndex. I want to filter this dataframe in 15 day chunks at a time. To do this manually I can do something like:
start, end = pd.Timestamp('2022-07-01'), pd.Timestamp('2022-07-16')
filt_df = df[start: end]
However, I want to be able to do this over a long range without manually creating these filtering strings. I could just iterate over a loop and advance them all with a pd.Timedelta
but it seems like I should be able to do something with the functions pandas has built-in such as pandas.interval_range
or something similar. However, interval_range
produces an IntervalIndex
and I don't know how to use that to filter a dataframe. Is there a cleaner way to do what I'm trying to do here?
CodePudding user response:
If I am understanding your problem correctly, you could use pd.date_range()
and groupby()
date_range = pd.date_range(start='2022-07-01', end='2022-07-31', freq='15D')
for start, end in zip(date_range, date_range[1:]):
filt_df = df.groupby(pd.cut(df.index, bins=[start, end])).get_group(start)
# Do something with filt_df
CodePudding user response:
You could use Grouper
import pandas as pd
idx = pd.date_range('2022-07-01', '2022-10-01')
df = pd.DataFrame({'idx': range(1,len(index) 1)}, index=idx)
df_chunks = [df_chunk for index, df_chunk in df.groupby(pd.Grouper(freq='15D'))]
Out[23]:
[ idx
2022-07-01 1
2022-07-02 2
2022-07-03 3
2022-07-04 4
2022-07-05 5
2022-07-06 6
2022-07-07 7
2022-07-08 8
2022-07-09 9
2022-07-10 10
2022-07-11 11
2022-07-12 12
2022-07-13 13
2022-07-14 14
2022-07-15 15,
idx
2022-07-16 16
2022-07-17 17