Home > Blockchain >  Calculating Higher Highs & Lower Lows in Stock Market Dataframe
Calculating Higher Highs & Lower Lows in Stock Market Dataframe

Time:06-20

I have a DataFrame containing intraday, 5 min data, for 1 month, 23 days trading days. Around 1725 rows, so 75 rows per day. There is a signal column, that gives buy/sell signal based on ST for now, but I might use another indicator later on.

PHASE 2 (in the Code added below) is where the code has to be corrected/implemented. I started working on it, but been stuck due to my lack of pandas knowledge.

  1. So I wanna create a new column, that only captures "High" column values, if the "High" value at any point of that same day is greater than the "High" value corresponding to the "buy" signal row.

  2. And also, create a new column, that only captures "Low" column values, if the "Low" value at any point of that same day is Lower than the "Low" value corresponding to "buy" signal row. Condition 2 is only checked if condition 1 is satisfied.

These 2 columns may help in analyzing the data, or for calculation. Column creation is not important but getting the necessary values for performing the calculations is necessary.

  1. Need a column that calculates returns.

    3a. If Condition 1 didn't happen, return 0.

    3b. If Condition 1 happened, but condition 2 didn't happen, return (last "Close" of day - "High" of "buy" signal row)

    3c. If Condition 1 happened, AND condition 2 happened, return ("Low" of "buy" row - "High" of "buy" row)

Adding some example scenarios below -

Here the Higher High and Lower Low both happened in same row.

return = Condition 3c, i.e. ["Low" of "buy" row - "High" of "buy" row] enter image description here

Another example below -

Here the Higher High happened first, then after a few rows, Lower Low happened.

return = Condition 3c, i.e. ["Low" of "buy" row - "High" of "buy" row] enter image description here

In cases where Lower Low doesn't happen, we take the last "Close" of that day.

Return = Condition 3b, i.e. ["Close" at 3 pm of that day - "High" of "buy" signal row] enter image description here

All of this above is for "buy" signal condition. I'll have to do it for "sell" as well. But if I can learn how to do it for one, I'll be able to do the other.

Code & Sample Data -

# -*- coding: utf-8 -*-
"""
Created on Mon Jun 15 


"""

import datetime as dt
import yfinance as yf
import pandas as pd
import numpy as np
import pandas_ta as ta
import copy


tickers = ["LT.NS","BRITANNIA.NS","CIPLA.NS","HINDALCO.NS"]

ohlc_intraday = {}

today = dt.datetime.today()
end = today - dt.timedelta(1)
start = today - dt.timedelta(20)

# looping over tickers and creating a dataframe with ohlcv, intraday tf
for ticker in tickers:
    temp = yf.download(ticker, period="1mo",interval = "5m")
    temp.dropna(how="any",inplace=True)
    # temp = temp[temp["Volume"] > 0]
    ohlc_intraday[ticker] = temp
    # ohlc_intraday[ticker].to_csv("supertrend_data.csv")

    


tickers = ohlc_intraday.keys()
# dont run above code again & again
# run below code    


# Phase 1 - Data Collection & Signal Generation
ohlc_dict = copy.deepcopy(ohlc_intraday)
tickers_signal = {}
tickers_ret = {}


for ticker in tickers:
    
    ohlc_dict[ticker]["datetime"] = pd.to_datetime(ohlc_dict[ticker].index)
    ohlc_dict[ticker]["hour"] = ohlc_dict[ticker]["datetime"].dt.hour
    ohlc_dict[ticker]["date"] = ohlc_dict[ticker]["datetime"].dt.date
    unique_days = len(ohlc_dict[ticker]["date"].unique())
    num_5min_candles = 75 # number of candles in intraday
    ohlc_dict[ticker]["candle"] = np.array(range(len(ohlc_dict[ticker])))%(num_5min_candles)   1
    
    
    # Supertrend indicator's value column
    ohlc_dict[ticker]["ST"] = ta.supertrend(high=ohlc_dict[ticker]["High"],low=ohlc_dict[ticker]["Low"],
                                  close=ohlc_dict[ticker]["Close"],length=7,
                                  multiplier=3)["SUPERT_7_3.0"]
    

    def st_levels(close,stlevel):
        if close > stlevel:
            return "buy"
        elif close < stlevel:
            return "sell"
        
    
    # capture Supertrend buy/sell signals    
    ohlc_dict[ticker]["signal"] = np.vectorize(st_levels)(ohlc_dict[ticker]["Close"],ohlc_dict[ticker]["ST"])
    
    # keep only first occurrence of signal, remove consecutive duplicated signals
    ohlc_dict[ticker].loc[ohlc_dict[ticker]["signal"].eq(ohlc_dict[ticker]["signal"].shift()), "signal"] = "no"
    
    # supertrend length; first 7 rows will contain "no" since ST period length is 7
    ohlc_dict[ticker]["signal"][:7] = "no"
    
    
    # capturing returns
    tickers_signal[ticker] = ""
    tickers_ret[ticker] = []
    # ohlc_dict[ticker].to_csv("supertrend_data.csv")


# Phase 2 - Calculating returns for intraday -
for ticker in tickers:
    print("Calculating returns for ", ticker)
    # ticker = "CIPLA.NS"
    for i in range(len(ohlc_dict[ticker])):
        if tickers_signal[ticker] == "":
            tickers_ret[ticker].append(0)
            # check if signal candle High/Low is broken for Buy/Sell Scenario respectively
            if ohlc_dict[ticker]["High"][i] >= ohlc_dict[ticker][ohlc_dict[ticker]['signal']=="buy"]["High"].values[0]:
                tickers_signal[ticker] = "BUY"
            elif ohlc_dict[ticker]["Low"][i] <= ohlc_dict[ticker][ohlc_dict[ticker]['signal']=="sell"]["Low"].values[0]:    
                tickers_signal[ticker] = "SELL"
        
        elif tickers_signal[ticker] == "BUY":
            
            # calculating returns for StopLoss            
            if ohlc_dict[ticker]["Low"][i] < ohlc_dict[ticker][ohlc_dict[ticker]['signal']=="buy"]["Low"].values[0]: # and add that both candles be on same day
                tickers_signal[ticker] = ""
                tickers_ret[ticker].append(((ohlc_dict[ticker][ohlc_dict[ticker]['signal']=="buy"]["Low"].values[0] - ohlc_dict[ticker][ohlc_dict[ticker]['signal']=="buy"]["High"].values[0])/(ohlc_dict[ticker][ohlc_dict[ticker]['signal']=="buy"]["High"].values[0])) - 1)
            # calculating returns for 70th candle close in intraday
            else:
                tickers_ret[ticker].append(((ohlc_dict[ticker][ohlc_dict[ticker]['candle']==70]["Close"].values[0] - ohlc_dict[ticker][ohlc_dict[ticker]['signal']=="buy"]["High"].values[0])/(ohlc_dict[ticker][ohlc_dict[ticker]['signal']=="buy"]["High"].values[0])) - 1)
                                
        elif tickers_signal[ticker] == "SELL":
            # calculating returns for StopLoss
            if ohlc_dict[ticker]["High"][i] > ohlc_dict[ticker][ohlc_dict[ticker]['signal']=="buy"]["High"].values[0]:
                tickers_signal[ticker] = ""
                tickers_ret[ticker].append(((ohlc_dict[ticker][ohlc_dict[ticker]['signal']=="buy"]["Low"].values[0] - ohlc_dict[ticker][ohlc_dict[ticker]['signal']=="buy"]["High"].values[0])/(ohlc_dict[ticker][ohlc_dict[ticker]['signal']=="buy"]["Low"].values[0])) - 1)
            
            # calculating returns for 70th candle close in intraday
            else:
                tickers_ret[ticker].append(((ohlc_dict[ticker][ohlc_dict[ticker]['signal']=="buy"]["Low"].values[0] - ohlc_dict[ticker][ohlc_dict[ticker]['candle']==70]["Close"].values[0])/(ohlc_dict[ticker][ohlc_dict[ticker]['signal']=="buy"]["High"].values[0])) - 1)


    ohlc_dict[ticker]["ret"] = np.array(tickers_ret[ticker])
                
                

Below data is for HINDALCO.NS stock -

Datetime Open High Low Close Adj Close Volume datetime hour date candle ST signal remarks Return Calculation
2022-05-18 13:35:00 05:30 431.5499878 433.5 431.0499878 433.1000061 433.1000061 263811 2022-05-18 13:35:00 05:30 13 18-05-2022 53 427.2457627 buy
2022-05-18 13:40:00 05:30 433.2000122 436.7999878 433.1000061 435.5499878 435.5499878 667616 2022-05-18 13:40:00 05:30 13 18-05-2022 54 429.0532754 no Higher High made
2022-05-18 13:45:00 05:30 435.6000061 436.5 435.3500061 435.7000122 435.7000122 188908 2022-05-18 13:45:00 05:30 13 18-05-2022 55 430.3779006 no
2022-05-18 13:50:00 05:30 435.7000122 435.7999878 434.2999878 434.3500061 434.3500061 296853 2022-05-18 13:50:00 05:30 13 18-05-2022 56 430.3779006 no
2022-05-18 13:55:00 05:30 434.6000061 436.1499939 434.4500122 435.4500122 435.4500122 169486 2022-05-18 13:55:00 05:30 13 18-05-2022 57 430.3779006 no
2022-05-18 14:00:00 05:30 435.4500122 436.1000061 434.75 436.0499878 436.0499878 343782 2022-05-18 14:00:00 05:30 14 18-05-2022 58 430.3779006 no
2022-05-18 14:05:00 05:30 436.0499878 437.1000061 433.6499939 433.7999878 433.7999878 511022 2022-05-18 14:05:00 05:30 14 18-05-2022 59 430.3779006 no
2022-05-18 14:10:00 05:30 433.75 434 432.6499939 433.3999939 433.3999939 249525 2022-05-18 14:10:00 05:30 14 18-05-2022 60 430.3779006 no
2022-05-18 14:15:00 05:30 433.6000061 434.5 432.7999878 433.75 433.75 217272 2022-05-18 14:15:00 05:30 14 18-05-2022 61 430.3779006 no
2022-05-18 14:20:00 05:30 433.7999878 435.8500061 433.6000061 434.6499939 434.6499939 228628 2022-05-18 14:20:00 05:30 14 18-05-2022 62 430.3779006 no
2022-05-18 14:25:00 05:30 434.6499939 435.2000122 434.2000122 434.5499878 434.5499878 149256 2022-05-18 14:25:00 05:30 14 18-05-2022 63 430.3779006 no
2022-05-18 14:30:00 05:30 434.4500122 434.75 433.75 433.8999939 433.8999939 155492 2022-05-18 14:30:00 05:30 14 18-05-2022 64 430.3779006 no
2022-05-18 14:35:00 05:30 433.8999939 434.9500122 433.6499939 434.5 434.5 128515 2022-05-18 14:35:00 05:30 14 18-05-2022 65 430.3779006 no
2022-05-18 14:40:00 05:30 434.6000061 435 434.25 434.5 434.5 117541 2022-05-18 14:40:00 05:30 14 18-05-2022 66 430.3779006 no
2022-05-18 14:45:00 05:30 434.4500122 435.1000061 433.1000061 433.2999878 433.2999878 276280 2022-05-18 14:45:00 05:30 14 18-05-2022 67 430.3779006 no
2022-05-18 14:50:00 05:30 433.25 434.8999939 433.2000122 434.7000122 434.7000122 163797 2022-05-18 14:50:00 05:30 14 18-05-2022 68 430.3779006 no
2022-05-18 14:55:00 05:30 434.7000122 435.6499939 434.5499878 435.2000122 435.2000122 201351 2022-05-18 14:55:00 05:30 14 18-05-2022 69 430.5468637 no
2022-05-18 15:00:00 05:30 435.0499878 435.75 433.75 435.5 435.5 384571 2022-05-18 15:00:00 05:30 15 18-05-2022 70 430.5468637 no Lower Low not made, Exited at Market Close Return = Close (15:00 pm) - High of 'buy' signal row (highlighted in yellow above)

UPDATE - In case of "buy" signal happens, and Trade is live, waiting to exit in loss or profit, and if opposite signal "sell" appears, exit trade at that "sell" signal "Close" value.

If this is too complicated, just ignore the signal change happening on same day.

PHASE 2 is where the code has to be corrected/implemented. Checking if signal was generated, if Higher Highs / Lower Lows happened, or just exiting at Market Close (3 pm) and calculate returns accordingly.

Currently as per my code, I'm trying to take "Close" of 70th candle, which is around 15:05 pm. But please feel free to modify/improve it so the condition takes 15:00 pm Close value, either using candle count, or using datetime value.

I know this is a huge favor I'm asking, but I have been trying to get this done for over 2 weeks and couldn't find any help. Lot of tutorials and solutions are on basic, standardized methods. This is a bit unconventional, and so unable to solve it on my own.

CodePudding user response:

Here is one solution. The methodology is as follows:

  1. For each ticker in tickers (as your code includes)
  2. Group the dataframe by "date"
  3. Apply the function to this group

The calculations returned are what you specified in the conditions:

  • 3a. 0
  • 3b. Last's/Sell's close - Buy's high
  • 3c. Buy's low - Buy's high

This is different to the calculations computed in your #Phase 2 code.

The final output of the function adds the "rets" value to the "sell" or "close" row (I'm not sure exactly what you wanted to do with this, but this is consistent with your sample data remark).

def phase_2(x):
    # returns are 0 unless event occurs
    x["ret"] = 0
    
    # for each buy/sell or buy/close combination
    for i in range(sum(x["signal"] == "buy")):
        after_buy = x.loc[x[x["signal"] == "buy"].index[i]:]
        # if sell exists after buy, then range up to sell row
        if sum(after_buy["signal"] == "sell") > 0:
            after_buy = after_buy.loc[:after_buy.query("signal == 'sell'").index.min()]
        # elif hour 15 exists after buy, then range up to the first of these (15:00)
        elif len(after_buy.query("hour == 15")) > 0:
            after_buy = after_buy.loc[:after_buy.query("hour == 15").index.min()]
        # else to end of day... (this would only occur if buy after 15:00 in day)

        # check if there is a "buy" signal in that day
        if len(after_buy) > 0:
            # high (from "buy signal row")
            high = after_buy["High"].iloc[0]
            # low (from "buy signal row")
            low = after_buy["Low"].iloc[0]
            # close (last row)
            close = after_buy["Close"].iloc[-1]

            # higher high
            after_higher_high = after_buy[after_buy["High"].cummax() > high]
            # if rows exist, then condition 1 met
            if len(after_higher_high) > 0:
                # # HH value
                # higher_high = after_higher_high["High"][0]

                # lower low
                lower_low = after_higher_high[after_higher_high["Low"].cummin() < low]
                # if rows exist, then condition 2 met, return 3c.
                if len(lower_low) > 0:
                    # # LL value
                    # lower_low = lower_low["Low"][0]
                    # return 3c.
                    # print(after_buy["datetime"][0], after_buy["datetime"][-1], low - high) # line to help understand which period is covered
                    x["ret"].loc[after_buy.index[-1]] = low - high # return to "sell" or "close" row of period

                # if not, then condition 2 not met, return 3b.
                else:
                    # return 3b.
                    # print(after_buy["datetime"][0], after_buy["datetime"][-1], close - high) # line to help understand which period is covered
                    x["ret"].loc[after_buy.index[-1]] = close - high # return to "sell" or "close" row of period
            # if not, then condition 1 not met, return 3a.
            else:
                # return 3a.
                # print(after_buy["datetime"][0], after_buy["datetime"][-1], 0) # line to help understand which period is covered
                x["ret"].loc[after_buy.index[-1]] = 0 # return to "sell" or "close" row of period
    return x


# for each ticker
for ticker in tickers:
    # overwrite dataframe as new dataframe
    ohlc_dict[ticker] = ohlc_dict[ticker].groupby(["date"]).apply(lambda x: phase_2(x))

There are a few items within the function that I have left (in case you needed them for anything):

  1. The HH value in the "higher high" section, and the LL value in the "lower low" section, which from what I understood of you question are not actually used.
  2. The print statements just before the "ret" values are added to the dataframe, which you can use to understand the periods used in the function, and the values returned.

The sample data's return from this calculation is 2 (435.5 (15:00 Close) - 433.5 (Buy High)).

Hopefully the comments within the function are sufficient as an explanation, but if not let me know!

NOTE:

The code returns the whole dataframe, and then overwrites the dataframe. As the only "changed" column within the function is "rets", the last line of the function could change to return x["rets"], and the for loop for each ticker to ohlc_dict[ticker]["rets"] = ...`

  • Related