Home > Blockchain >  How can I detect if a value in a DataFrame crosses another value in the DataFrame
How can I detect if a value in a DataFrame crosses another value in the DataFrame

Time:10-21

This is a sample of my DataFrame which is quite large. It is candlestick OHLC data with some additional columns 'Target' etc.

                             Open      High  ...      Target  
Close Time                                   ...                     
2022-05-23 06:59:59.999  30301.46  30655.00  ...         NaN      
2022-05-23 07:59:59.999  30487.77  30670.51  ...         NaN      
2022-05-23 08:59:59.999  30457.01  30523.51  ...         NaN     
2022-05-23 09:59:59.999  30345.73  30520.00  ...    30520.00      
2022-05-23 10:59:59.999  30441.94  30519.28  ...         NaN      

How can I check if the price of this particular asset crosses over the target price after that target price was initially hit.

So the 'Target' time is 2022-05-23 09:59:59.999 and this candlestick data is in 1 hour timeframes. the price is currently moving up however say 10 hours later the price moves back down and touches the same price as the 'Target'.

There are many targets in the DataFrame and I want to be able to detect price crossing the target regardless of the direction of the price movement

How can I determine if this happens and how can I then make this particular 'Target' value NaN?

I've tried something like this but can't get it to work

for i in range(len(df['Open'])):
    if df['High'][i:] < df['Target'][i]:
        if df['High'][i:] > df['Target'][i]:
            print('Target hit')

I get ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Notice in the plotted candlestick chart the 'Target' price (blue triangle) is passed again in the future candles. This is what I am trying to detect and remove the target from df once it has been hit. enter image description here

CodePudding user response:

The following should work, assuming you also have a "Low" column.

## generate example dataframe ##
from numpy import nan
d = {'Open': {'06:59:59.999': 30301.46, '07:59:59.999': 30487.77, '08:59:59.999': 30457.01, '09:59:59.999': 30345.73, '10:59:59.999': 30441.94}, 
     'High': {'06:59:59.999': 30655.0, '07:59:59.999': 30670.51, '08:59:59.999': 30523.51, '09:59:59.999': 30520.0, '10:59:59.999': 30519.28}, 
     'Target': {'06:59:59.999': 30600.0, '07:59:59.999': 30600.0, '08:59:59.999': 30500.0, '09:59:59.999': 30521.0, '10:59:59.999': 30500.0}}
df = pd.DataFrame(d)
df["Low"] = df["Open"]

## find 'Target' values that need to be NaN'ed ##
idx_high,idx_low,idx_targ = map(df.columns.get_loc,["High","Low","Target"])
row_bool = [
            ((df.iloc[i:,idx_high] > df.iat[i,idx_targ])
                &(df.iloc[i:,idx_low] < df.iat[i,idx_targ])
            ).any() for i in range(len(df))
           ]

## set those values to NaN ##
df.loc[row_bool,"Target"] = np.nan

For the particular example I constructed, we start with

                  Open      High   Target       Low
06:59:59.999  30301.46  30655.00  30600.0  30301.46
07:59:59.999  30487.77  30670.51  30600.0  30487.77
08:59:59.999  30457.01  30523.51  30500.0  30457.01
09:59:59.999  30345.73  30520.00  30521.0  30345.73
10:59:59.999  30441.94  30519.28  30500.0  30441.94

and end up with

                  Open      High   Target       Low
06:59:59.999  30301.46  30655.00      NaN  30301.46
07:59:59.999  30487.77  30670.51      NaN  30487.77
08:59:59.999  30457.01  30523.51      NaN  30457.01
09:59:59.999  30345.73  30520.00  30521.0  30345.73
10:59:59.999  30441.94  30519.28      NaN  30441.94

I suspect that the following is a bit faster.

## find 'Target' values that need to be NaN'ed ##
row_bool = [
            ((df.loc[idx:,"High"] > row["Target"])
                &(df.loc[idx:,"Low"] < row["Target"])
            ).any() for idx,row in df.iterrows()
           ]

## set those values to NaN ##
df.loc[row_bool,"Target"] = np.nan

Interestingly (to me), this means that the whole thing can be done with a single (not very readable) line of code:

df.loc[[((df.loc[idx:,"High"] > row["Target"])&(df.loc[idx:,"Low"] < row["Target"])).any() for idx,row in df.iterrows()],"Target"] = np.nan
  • Related