Home > Enterprise >  Is it possible to filter a pandas dataframe with an IntervalIndex or something similar?
Is it possible to filter a pandas dataframe with an IntervalIndex or something similar?

Time:01-18

I have a pandas dataframe that has a DatetimeIndex. I want to filter this dataframe in 15 day chunks at a time. To do this manually I can do something like:

start, end = pd.Timestamp('2022-07-01'), pd.Timestamp('2022-07-16')
filt_df = df[start: end]

However, I want to be able to do this over a long range without manually creating these filtering strings. I could just iterate over a loop and advance them all with a pd.Timedelta but it seems like I should be able to do something with the functions pandas has built-in such as pandas.interval_range or something similar. However, interval_range produces an IntervalIndex and I don't know how to use that to filter a dataframe. Is there a cleaner way to do what I'm trying to do here?

CodePudding user response:

If I am understanding your problem correctly, you could use pd.date_range() and groupby()

date_range = pd.date_range(start='2022-07-01', end='2022-07-31', freq='15D')

for start, end in zip(date_range, date_range[1:]):
    filt_df = df.groupby(pd.cut(df.index, bins=[start, end])).get_group(start)
    # Do something with filt_df

CodePudding user response:

You could use Grouper

import pandas as pd
idx = pd.date_range('2022-07-01', '2022-10-01')
df = pd.DataFrame({'idx': range(1,len(index) 1)}, index=idx)
df_chunks = [df_chunk for index, df_chunk in df.groupby(pd.Grouper(freq='15D'))] 

Out[23]: 
[            idx
 2022-07-01    1
 2022-07-02    2
 2022-07-03    3
 2022-07-04    4
 2022-07-05    5
 2022-07-06    6
 2022-07-07    7
 2022-07-08    8
 2022-07-09    9
 2022-07-10   10
 2022-07-11   11
 2022-07-12   12
 2022-07-13   13
 2022-07-14   14
 2022-07-15   15,
             idx
 2022-07-16   16
 2022-07-17   17
  • Related