Selecting rows in a specific time window for each day of a month

Time:09-07

I have the processed data set below, and I want the rows corresponding to the times 09:15 to 09:20 for each day over about a month. The data covers every day of a whole month at 5-minute intervals, starting from 09:15.

Below is a sample of the data in dictionary format:

[{'datetime': '2022-01-03 09:15:00',
   'stock_code': 'NIFTY',
   'exchange_code': 'NFO',
   'product_type': 'Futures',
   'expiry_date': '27-JAN-22',
   'right': 'Others',
   'strike_price': '0',
   'open': '17443',
   'high': '17509.8496',
   'low': '17410.0508',
   'close': '17506.8496',
   'volume': '481550',
   'open_interest': '10319300',
   'count': 0},
  {'datetime': '2022-01-03 09:20:00',
   'stock_code': 'NIFTY',
   'exchange_code': 'NFO',
   'product_type': 'Futures',
   'expiry_date': '27-JAN-22',
   'right': 'Others',
   'strike_price': '0',
   'open': '17512',
   'high': '17515',
   'low': '17483.5996',
   'close': '17483.5996',
   'volume': '538550',
   'open_interest': '10208300',
   'count': 1},

I tried splitting the datetime column into separate Dates and Time columns with .dt.date and .dt.time, and then filtering on .dt.hour and .dt.minute, but this raises an error.

df['Dates'] = pd.to_datetime(df['datetime']).dt.date
df['Time'] = pd.to_datetime(df['datetime']).dt.time
df
time_mask = (df['Time'].dt.hour == 9) & ((df['Time'].dt.minute >= 15) & (df['Time'].dt.minute <= 20))

However, I get the error "Can only use .dt accessor with datetimelike values". I tried googling it but couldn't find a solution.

Thanks!!

EDIT: The rows with times between 09:15 and 09:20 are now identified by the following mask:

hour = pd.to_datetime(df['datetime']).dt.hour
minute = pd.to_datetime(df['datetime']).dt.minute
time_mask = hour.eq(9) & minute.between(15,20)

time_mask

This uses the code suggested in the answer. Now, how do I select only the rows where time_mask is True? I tried

rslt_df = df(time_mask==True)
rslt_df

but it seems df is not a callable object. Is there any other way to proceed?

CodePudding user response:

The issue is that you're calling dt twice in a row. When you use df['Time'].dt.hour in your time_mask, you're really doing pd.to_datetime(df['datetime']).dt.time.dt.hour because of how you defined df['Time']. The .dt.time step returns plain datetime.time objects in an object-dtype column, which the .dt accessor does not support, so running dt.time.dt.hour reproduces the same error.
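To see this in isolation, here is a minimal sketch. The two-row frame is only an illustrative stand-in built from the first two timestamps in the question, and the variable names parsed, times and error_message are made up for this example.

```python
import pandas as pd

# Illustrative stand-in for the question's data (timestamps only).
df = pd.DataFrame({"datetime": ["2022-01-03 09:15:00", "2022-01-03 09:20:00"]})

parsed = pd.to_datetime(df["datetime"])  # datetime-like Series: .dt works here
times = parsed.dt.time                   # Series of datetime.time objects

print(pd.api.types.is_datetime64_any_dtype(parsed))  # True
print(times.dtype)                                   # object

# Calling .dt on the object-dtype Series raises the error from the question.
try:
    times.dt.hour
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

The hour and minute are still available from parsed, which is why the fix below works on the parsed datetime column rather than on df['Time'].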

You can access the hour and minute values directly, without saving them as columns in your table. You can also use eq as a replacement for == and between (inclusive on both ends by default) to simplify your logic:

hour = pd.to_datetime(df['datetime']).dt.hour
minute = pd.to_datetime(df['datetime']).dt.minute

time_mask = hour.eq(9) & minute.between(15,20)
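As for the follow-up in the EDIT: df(time_mask==True) fails because parentheses try to call the DataFrame. Rows are selected with square brackets (boolean indexing). The sketch below assumes a small made-up frame; only the first two timestamps come from the question, the rest are illustrative. The between_time variant is an alternative that was not part of the original answer and needs a DatetimeIndex.

```python
import pandas as pd

# Made-up sample: first two timestamps are from the question, the rest are illustrative.
df = pd.DataFrame({
    "datetime": [
        "2022-01-03 09:15:00",
        "2022-01-03 09:20:00",
        "2022-01-03 09:25:00",
        "2022-01-04 09:15:00",
        "2022-01-04 15:20:00",
    ]
})

parsed = pd.to_datetime(df["datetime"])
time_mask = parsed.dt.hour.eq(9) & parsed.dt.minute.between(15, 20)

# Boolean indexing: square brackets, not a function call.
rslt_df = df[time_mask]  # df.loc[time_mask] is equivalent
print(rslt_df)

# Alternative (not from the original answer): index by the parsed datetimes
# and let between_time pick the window on every day; both ends are inclusive.
alt_df = df.set_index(parsed).between_time("09:15", "09:20")
print(alt_df)
```

Both approaches keep the 09:15 and 09:20 rows for every date in the frame, so the same code applies unchanged to a full month of data.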