Sorting data frame by time period; datetime64[ns]

Time:12-02

I have another issue with summing up a column (Python - Pandas)

I have a data frame "new" with dates from a period of 5 days. The 'Date' column is of type datetime64[ns]. I am trying to filter the data frame by date, for example "all values between 2021-10-10 and 2021-10-15" or "all values after 2021-10-14". No matter what I try, I get error messages. Starting with:

mask = (new['Date'] > '2021-10-10') & (df['Date'] <= '2021-10-15')

I get:

TypeError: '<=' not supported between instances of 'datetime.date' and 'str'

After this error I tried to convert the slice bounds, following this advice: "The thing is you want to slice using strings such as '2017-07-07' while your index is of type datetime.date. Your slices should be of this type too. You can do this by defining your start date and end date as follows:"

import pandas as pd
startdate = pd. to_datetime("2017-7-7").date()
enddate = pd. to_datetime("2017-7-10").date()
df.loc[startdate:enddate]

(I removed the spaces, of course.) But now I get:

TypeError: '<' not supported between instances of 'int' and 'datetime.date'

I just want to sort and filter my data frame by different time periods. Thanks for any help.

CodePudding user response:

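To make sure both sides of the comparison have the same type, convert the column with pd.to_datetime() and compare it against pd.to_datetime() values instead of plain strings. Passing infer_datetime_format=True lets pandas infer the date format, which can speed up parsing (in pandas 2.0 and later this argument is deprecated, because format inference is the default behaviour):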

df['Date'] = pd.to_datetime(df['Date'],infer_datetime_format=True)

df = df[(df['Date'] > pd.to_datetime('2021-10-10')) & (df['Date'] <= pd.to_datetime('2021-10-15'))]
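For reference, here is a minimal, self-contained sketch of the same idea that also covers the sorting part of the question. The sample data and the 'Value' column are made up for illustration, and it assumes the 'Date' column initially holds datetime.date objects (object dtype), which is what the first error message suggests. The last two lines show one possible alternative: slicing with .loc on a sorted DatetimeIndex.

```python
import datetime as dt

import pandas as pd

# Hypothetical sample data: 'Date' holds datetime.date objects (object dtype),
# the situation that the first TypeError in the question points to.
df = pd.DataFrame({
    "Date": [dt.date(2021, 10, day) for day in (16, 10, 14, 12, 15)],
    "Value": [5, 1, 3, 2, 4],
})

# Convert the column to a datetime64 dtype so comparisons are well defined.
df["Date"] = pd.to_datetime(df["Date"])

# Filter: rows after 2021-10-10, up to and including 2021-10-15.
start, end = pd.Timestamp("2021-10-10"), pd.Timestamp("2021-10-15")
in_range = df[(df["Date"] > start) & (df["Date"] <= end)]

# Sort the filtered rows chronologically.
sorted_range = in_range.sort_values("Date")

# Alternative: make 'Date' a sorted index, then slice with date strings.
by_date = df.set_index("Date").sort_index()
from_15th = by_date.loc["2021-10-15":]
```

Slicing with .loc only works by date once the index itself contains datetimes. On a default integer index, a date slice is compared against integers, which would explain the second error in the question.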