Filter date range in pandas raised `UserWarning: Boolean Series key will be reindexed to match DataFrame index`

Time:10-19

I tried to filter a DataFrame by a date range, where I have an initial start_date and the end_date is x days after the start_date. Basically, what I want is the equivalent of the SQL clause WHERE date BETWEEN start_date AND DATE_ADD(start_date, INTERVAL x DAYS) AS end_date.

Here is an example of my DataFrame:

+-----------+-----------+
| date      | aggregate |
+-----------+-----------+
| ...       | ...       |
|2022-08-31 | 42        |
|2022-09-01 | 30        |
|2022-09-02 | 65        |
|2022-09-03 | 55        |
| ...       | ...       |
+-----------+-----------+

So I tried this in Python:

import pandas as pd
from datetime import datetime, timedelta

start_date = datetime.strptime("2022-08-31", "%Y-%m-%d")
end_date = start_date + timedelta(days=3)  # let's say I want a 3-day range

df_filtered = df[(df['date'] >= start_date) & (df['date'] < end_date)]

But it raised UserWarning: Boolean Series key will be reindexed to match DataFrame index. and returned a DataFrame with several dates missing.

CodePudding user response:

How about setting the date column as the index and then filtering on it:

import pandas as pd
from datetime import datetime, timedelta

df = pd.DataFrame([
    ['2022-08-31',42],
    ['2022-09-01',30],
    ['2022-09-02',65],
    ['2022-09-03',55],
],columns=['date','aggregate'])

df['date'] = pd.to_datetime(df['date'])  # parse the date strings into datetimes
df.set_index('date', inplace=True)
start_date = datetime.strptime("2022-08-31", "%Y-%m-%d")
end_date = start_date + timedelta(days=3)  # let's say I want a 3-day range
df[(df.index >= start_date) & (df.index < end_date)]


            aggregate
date
2022-08-31         42
2022-09-01         30
2022-09-02         65
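
If you would rather keep date as a regular column, a minimal sketch along the following lines may also work. pandas emits this warning when the boolean Series used as the key has an index that differs from the DataFrame's index, so the idea here is to build the mask from the very same frame it is applied to. The sample data is copied from the question; the real frame is assumed to have the same two columns, and the name mask is only illustrative.

```python
import pandas as pd

# Sample data mirroring the question (assumption: the real frame looks like this).
df = pd.DataFrame(
    {
        "date": ["2022-08-31", "2022-09-01", "2022-09-02", "2022-09-03"],
        "aggregate": [42, 30, 65, 55],
    }
)
df["date"] = pd.to_datetime(df["date"])  # make sure the column holds datetimes

start_date = pd.Timestamp("2022-08-31")
end_date = start_date + pd.Timedelta(days=3)  # exclusive upper bound

# Build the mask from the same frame it is applied to, so both indexes match.
mask = (df["date"] >= start_date) & (df["date"] < end_date)
df_filtered = df.loc[mask]
print(df_filtered)
```

If the warning still shows up in your own code, it may be worth checking whether the mask was computed from a different (for example, previously filtered or re-indexed) DataFrame than the one being indexed.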