Home > database >  Filter pandas dataframe by date < another date where date is in an index
Filter pandas dataframe by date < another date where date is in an index

Time:11-17

I have a pandas dataframe (from the Binance API) indexed by symbol then date:

                    open      high       low     close        volume
symbol  date                                                            
BTCUSDT 2017-08-17   4261.48   4485.39   4200.74   4285.08    795.150377
        2017-08-18   4285.08   4371.52   3938.77   4108.37   1199.888264
        2017-08-19   4108.37   4184.69   3850.00   4139.98    381.309763
        2017-08-20   4120.98   4211.08   4032.62   4086.29    467.083022
        2017-08-21   4069.13   4119.62   3911.79   4016.00    691.743060
                      ...       ...       ...       ...           ...
        2021-11-10  66947.67  69000.00  62822.90  64882.43  65171.504050
        2021-11-11  64882.42  65600.07  64100.00  64774.26  37237.980580
        2021-11-12  64774.25  65450.70  62278.00  64122.23  44490.108160
        2021-11-13  64122.22  65000.00  63360.22  64380.00  22504.973830
        2021-11-14  64380.01  65550.51  63576.27  65519.10  25705.073470

I need to filter by date, such as date < '2021-11-11' (where '2021-11-11' is a datetime object matching the date column type).

I can get this to work via:

df = df[df['date'] < utc_datetime]

if date is not in an index.

I've investigated df.filter(), read the pandas documentation, and googled extensively (and searched SO) before posting.

If this is obvious my apologies, please just point me in the right direction.

CodePudding user response:

Use MultiIndex.get_level_values:

df = df[df.index.get_level_values('date') < utc_datetime]

Or MultiIndex.droplevel:

df = df[df.index.droplevel(0) < utc_datetime]

Or convert MultiIndex to columns, but then different index, so converting to numpy array:

df = df[df.reset_index()['date'].to_numpy() < utc_datetime]
  • Related