I have the processed data set shown below, and I want the rows corresponding to times 9:15 to 9:20 for each day, for about a month. The data I currently have covers every day of a whole month at 5-minute intervals, starting from 9:15.
Below is the data in dictionary format:
[{'datetime': '2022-01-03 09:15:00',
'stock_code': 'NIFTY',
'exchange_code': 'NFO',
'product_type': 'Futures',
'expiry_date': '27-JAN-22',
'right': 'Others',
'strike_price': '0',
'open': '17443',
'high': '17509.8496',
'low': '17410.0508',
'close': '17506.8496',
'volume': '481550',
'open_interest': '10319300',
'count': 0},
{'datetime': '2022-01-03 09:20:00',
'stock_code': 'NIFTY',
'exchange_code': 'NFO',
'product_type': 'Futures',
'expiry_date': '27-JAN-22',
'right': 'Others',
'strike_price': '0',
'open': '17512',
'high': '17515',
'low': '17483.5996',
'close': '17483.5996',
'volume': '538550',
'open_interest': '10208300',
'count': 1},
I tried splitting the datetime into Date and Time columns and then using the .dt accessor on the Time column to filter by hour and minute, but it gives an error.
df['Dates'] = pd.to_datetime(df['datetime']).dt.date
df['Time'] = pd.to_datetime(df['datetime']).dt.time
df
time_mask = (df['Time'].dt.hour == 9) & ((df['Time'].dt.minute >= 15) & (df['Time'].dt.minute <= 20))
but I am getting the error "Can only use .dt accessor with datetimelike values". I tried googling it but couldn't find a solution that fixes it.
Thanks!!
EDIT: The rows with times 9:15-9:20 are now being flagged correctly by
hour = pd.to_datetime(df['datetime']).dt.hour
minute = pd.to_datetime(df['datetime']).dt.minute
time_mask = hour.eq(9) & minute.between(15,20)
time_mask
That is the result of using the above code, as suggested. Now, how do I select only the rows where time_mask is True? I tried
rslt_df = df(time_mask==True)
rslt_df
but it seems df is not a callable object. Is there any other way to proceed?
CodePudding user response:
The issue is that you're calling dt twice in a row. When you do df['Time'].dt.hour in your time_mask, you're really doing pd.to_datetime(df['datetime']).dt.time.dt.hour because of how you defined df['Time']. The .dt.time step returns plain datetime.time objects rather than datetimelike values, so running dt.time.dt.hour results in your same error.
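To make the cause concrete, here is a minimal sketch that reproduces the error on two timestamps copied from the question. The variable names parsed, times, dt_error and hours are only illustrative and are not from the original post.

```python
import pandas as pd

# Two timestamps taken from the question's sample data
df = pd.DataFrame({"datetime": ["2022-01-03 09:15:00", "2022-01-03 09:20:00"]})

parsed = pd.to_datetime(df["datetime"])  # a datetime64 Series, so .dt works here
times = parsed.dt.time                   # object Series of datetime.time values

try:
    times.dt.hour                        # second .dt, now on a non-datetime Series
    dt_error = None
except AttributeError as exc:
    dt_error = str(exc)                  # the error message from the question

# Asking the parsed Series directly works
hours = parsed.dt.hour
print(dt_error)
print(hours.tolist())
```

The first .dt call succeeds because parsed holds real datetimes. The second fails because times no longer does.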
You can access the hour and minute values directly, and you don't have to save them as columns in your table. You can also use eq as a replacement for == and between to simplify your logic:
hour = pd.to_datetime(df['datetime']).dt.hour
minute = pd.to_datetime(df['datetime']).dt.minute
time_mask = hour.eq(9) & minute.between(15,20)
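Regarding the follow-up in the EDIT: df(time_mask==True) fails because parentheses try to call the DataFrame like a function. Rows are selected with square brackets (boolean indexing) instead. Below is a minimal sketch. The first two rows come from the question's data, while the last two rows and their close values are made up purely to show a non-matching time and a second day.

```python
import pandas as pd

records = [
    {"datetime": "2022-01-03 09:15:00", "close": "17506.8496"},
    {"datetime": "2022-01-03 09:20:00", "close": "17483.5996"},
    {"datetime": "2022-01-03 09:25:00", "close": "17490.0"},  # made-up row, outside the window
    {"datetime": "2022-01-04 09:15:00", "close": "17600.0"},  # made-up row, second day
]
df = pd.DataFrame(records)

# Parse once and reuse the parsed Series for both components
ts = pd.to_datetime(df["datetime"])
time_mask = ts.dt.hour.eq(9) & ts.dt.minute.between(15, 20)  # between is inclusive by default

# Boolean indexing: square brackets, and no need for "== True"
rslt_df = df[time_mask]
print(rslt_df)
```

df.loc[time_mask] is equivalent here. Because the mask only looks at the hour and minute, it keeps the 9:15 and 9:20 rows for every day in the data.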