Let's say I have 2 days worth of date times:
import pandas as pd
import numpy as np
index = pd.date_range("2020-01-01 00:00:00", "2020-01-03 00:00:00", freq="15T")
print(index)
DatetimeIndex(['2020-01-01 00:00:00', '2020-01-01 00:15:00',
'2020-01-01 00:30:00', '2020-01-01 00:45:00',
'2020-01-01 01:00:00', '2020-01-01 01:15:00',
'2020-01-01 01:30:00', '2020-01-01 01:45:00',
'2020-01-01 02:00:00', '2020-01-01 02:15:00',
...
'2020-01-02 21:45:00', '2020-01-02 22:00:00',
'2020-01-02 22:15:00', '2020-01-02 22:30:00',
'2020-01-02 22:45:00', '2020-01-02 23:00:00',
'2020-01-02 23:15:00', '2020-01-02 23:30:00',
'2020-01-02 23:45:00', '2020-01-03 00:00:00'],
dtype='datetime64[ns]', length=193, freq='15T')
And then I declare a pandas array based on my index with each element set to False:
entries = pd.Series(False, index=index)
I then get the market times, and the open times:
market_hours_indices = entries.index.indexer_between_time('9:30', '16:00')
market_hours_index = entries.index[market_hours_indices]
open_hours_indices = entries.index.indexer_between_time('9:30', '10:30')
open_hours_index = entries.index[open_hours_indices]
And then mask my initial pandas array with my indexs:
entries = entries[market_hours_index]
entries[~entries.index.isin(open_hours_index)] = False
entries
This gives me:
2020-01-01 09:30:00 True
2020-01-01 09:45:00 True
2020-01-01 10:00:00 True
2020-01-01 10:15:00 True
2020-01-01 10:30:00 True
2020-01-01 10:45:00 False
2020-01-01 11:00:00 False
2020-01-01 11:15:00 False
2020-01-01 11:30:00 False
2020-01-01 11:45:00 False
2020-01-01 12:00:00 False
2020-01-01 12:15:00 False
2020-01-01 12:30:00 False
2020-01-01 12:45:00 False
2020-01-01 13:00:00 False
2020-01-01 13:15:00 False
2020-01-01 13:30:00 False
2020-01-01 13:45:00 False
2020-01-01 14:00:00 False
2020-01-01 14:15:00 False
2020-01-01 14:30:00 False
2020-01-01 14:45:00 False
2020-01-01 15:00:00 False
2020-01-01 15:15:00 False
2020-01-01 15:30:00 False
2020-01-01 15:45:00 False
2020-01-01 16:00:00 False
2020-01-02 09:30:00 True
2020-01-02 09:45:00 True
2020-01-02 10:00:00 True
2020-01-02 10:15:00 True
2020-01-02 10:30:00 True
2020-01-02 10:45:00 False
2020-01-02 11:00:00 False
2020-01-02 11:15:00 False
2020-01-02 11:30:00 False
2020-01-02 11:45:00 False
2020-01-02 12:00:00 False
2020-01-02 12:15:00 False
2020-01-02 12:30:00 False
2020-01-02 12:45:00 False
2020-01-02 13:00:00 False
2020-01-02 13:15:00 False
2020-01-02 13:30:00 False
2020-01-02 13:45:00 False
2020-01-02 14:00:00 False
2020-01-02 14:15:00 False
2020-01-02 14:30:00 False
2020-01-02 14:45:00 False
2020-01-02 15:00:00 False
2020-01-02 15:15:00 False
2020-01-02 15:30:00 False
2020-01-02 15:45:00 False
2020-01-02 16:00:00 False
dtype: bool
Which is almost what I'm aiming to do, but how do I only target the second day? So that my array looks like:
2020-01-01 09:30:00 False
2020-01-01 09:45:00 False
2020-01-01 10:00:00 False
2020-01-01 10:15:00 False
2020-01-01 10:30:00 False
2020-01-01 10:45:00 False
2020-01-01 11:00:00 False
2020-01-01 11:15:00 False
2020-01-01 11:30:00 False
2020-01-01 11:45:00 False
2020-01-01 12:00:00 False
2020-01-01 12:15:00 False
2020-01-01 12:30:00 False
2020-01-01 12:45:00 False
2020-01-01 13:00:00 False
2020-01-01 13:15:00 False
2020-01-01 13:30:00 False
2020-01-01 13:45:00 False
2020-01-01 14:00:00 False
2020-01-01 14:15:00 False
2020-01-01 14:30:00 False
2020-01-01 14:45:00 False
2020-01-01 15:00:00 False
2020-01-01 15:15:00 False
2020-01-01 15:30:00 False
2020-01-01 15:45:00 False
2020-01-01 16:00:00 False
2020-01-02 09:30:00 True
2020-01-02 09:45:00 True
2020-01-02 10:00:00 True
2020-01-02 10:15:00 True
2020-01-02 10:30:00 True
2020-01-02 10:45:00 False
2020-01-02 11:00:00 False
2020-01-02 11:15:00 False
2020-01-02 11:30:00 False
2020-01-02 11:45:00 False
2020-01-02 12:00:00 False
2020-01-02 12:15:00 False
2020-01-02 12:30:00 False
2020-01-02 12:45:00 False
2020-01-02 13:00:00 False
2020-01-02 13:15:00 False
2020-01-02 13:30:00 False
2020-01-02 13:45:00 False
2020-01-02 14:00:00 False
2020-01-02 14:15:00 False
2020-01-02 14:30:00 False
2020-01-02 14:45:00 False
2020-01-02 15:00:00 False
2020-01-02 15:15:00 False
2020-01-02 15:30:00 False
2020-01-02 15:45:00 False
2020-01-02 16:00:00 False
dtype: bool
What I've tried:
# get the days I'm interested in
df_all_days = df[(df['date'] >= previous_day) & (df['date'] <= date)]
# get the day I want to trade
df_current_day = df[df['date'] == date]
current_day_index = df_current_day.index
# create an empty numpy array
entries = pd.Series(True, index=current_day_index)
# get the index of the current day between trading hours
market_hours_indices = entries.index.indexer_between_time('9:30', '16:00')
market_hours_index = entries.index[market_hours_indices]
# return False any dates and times which fall outside of our date
df_all_days['enter'] = df_all_days['enter'][~df_all_days['enter'].index.isin(market_hours_index)] = False
This returns (everything false):
2022-06-15 07:16:00 False
2022-06-15 09:17:00 False
2022-06-15 09:18:00 False
2022-06-15 09:19:00 False
...
2022-06-16 19:59:00 False
Name: enter, dtype: bool
CodePudding user response:
I figured it out:
df = pd.read_csv(fname, index_col='datetime', usecols=["open", "high", "low", "close", "volume", "datetime", "date", "time"], parse_dates=True)
# get the days I'm interested in
df_all_days = df[(df['date'] >= previous_day) & (df['date'] <= date)]
# get the day I want to trade
df_current_day = df_all_days[df_all_days['date'] == date]
# get the hours I want to trade
df_current_day_indices = df_current_day.index.indexer_between_time('09:30', '16:00')
current_day_index = df_current_day.index[df_current_day_indices]
# create fake entry logic
df_all_days['entries'] = np.where((0), True, False)
# only enter during the days and hours we wanna trade
df_all_days['entries'] = df_all_days['entries'].index.isin(current_day_index)
with pd.option_context('display.max_rows', None, 'display.max_columns', None): # more options can be specified also
# both arrays are the same size
print(df_all_days['entries'].size)
print(df_all_days['close'].size)
print(df_all_days['entries'])