Home > database >  Date filter problems (python)
Date filter problems (python)

Time:10-09

I have some problems filtering a specific period of dates from my data frame.

E.g. I'd need dates between 14/09/2014 - 30/09/2020. To get that, I used

df = df.loc[(df['date'] >= '14/09/2014') and (df['date'] <= '30/08/2020')] (value error) as well as

df = df.loc['14/09/2014':'30/08/2020'] --> which kind of worked but it also showed dates outside the specific range (e.g. 12/09/1965).

For additional information: I set up my date column ('edate') as index column, see following code:

df = pd.read_csv('https://manifestoproject.wzb.eu/down/data/2020b/datasets/MPDataset_MPDS2020b.csv', parse_dates=['date'], index_col='edate')

Also, I use dtale to show the results via localhost.

Can someone please explain why it shows further dates outside the range (when using the code with 14/09/2014:30/08/2020)? Also, I would be more than thankful if anyone could provide a solution to my problem!

Best regards

CodePudding user response:

There are 2 problems - for bitwise and is used & and for compare by datetimes is necessary format YYYY-MM-DD:

df = df[(df['date'] >= '2014-09-14') &(df['date'] <= '2020-08-30')]

Check with data:

#original data
print (len(df))
4656

#only changed and to &
df1 = df.loc[(df['date'] >= '14/09/2014') & (df['date'] <= '30/08/2020')]
print (len(df1))
4656

# changed and to & and format YYYY-MM-DD
df1 = df[(df['date'] >= '2014-09-14') &(df['date'] <= '2020-08-30')]
print (len(df1))
585
  • Related