This is a sample of my DataFrame which is quite large. It is candlestick OHLC data with some additional columns 'Target' etc.
Open High ... Target
Close Time ...
2022-05-23 06:59:59.999 30301.46 30655.00 ... NaN
2022-05-23 07:59:59.999 30487.77 30670.51 ... NaN
2022-05-23 08:59:59.999 30457.01 30523.51 ... NaN
2022-05-23 09:59:59.999 30345.73 30520.00 ... 30520.00
2022-05-23 10:59:59.999 30441.94 30519.28 ... NaN
How can I check if the price of this particular asset crosses over the target price after that target price was initially hit.
So the 'Target' time is 2022-05-23 09:59:59.999 and this candlestick data is in 1 hour timeframes. the price is currently moving up however say 10 hours later the price moves back down and touches the same price as the 'Target'.
There are many targets in the DataFrame and I want to be able to detect price crossing the target regardless of the direction of the price movement
How can I determine if this happens and how can I then make this particular 'Target' value NaN?
I've tried something like this but can't get it to work
for i in range(len(df['Open'])):
if df['High'][i:] < df['Target'][i]:
if df['High'][i:] > df['Target'][i]:
print('Target hit')
I get ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Notice in the plotted candlestick chart the 'Target' price (blue triangle) is passed again in the future candles. This is what I am trying to detect and remove the target from df once it has been hit.
CodePudding user response:
The following should work, assuming you also have a "Low" column.
## generate example dataframe ##
from numpy import nan
d = {'Open': {'06:59:59.999': 30301.46, '07:59:59.999': 30487.77, '08:59:59.999': 30457.01, '09:59:59.999': 30345.73, '10:59:59.999': 30441.94},
'High': {'06:59:59.999': 30655.0, '07:59:59.999': 30670.51, '08:59:59.999': 30523.51, '09:59:59.999': 30520.0, '10:59:59.999': 30519.28},
'Target': {'06:59:59.999': 30600.0, '07:59:59.999': 30600.0, '08:59:59.999': 30500.0, '09:59:59.999': 30521.0, '10:59:59.999': 30500.0}}
df = pd.DataFrame(d)
df["Low"] = df["Open"]
## find 'Target' values that need to be NaN'ed ##
idx_high,idx_low,idx_targ = map(df.columns.get_loc,["High","Low","Target"])
row_bool = [
((df.iloc[i:,idx_high] > df.iat[i,idx_targ])
&(df.iloc[i:,idx_low] < df.iat[i,idx_targ])
).any() for i in range(len(df))
]
## set those values to NaN ##
df.loc[row_bool,"Target"] = np.nan
For the particular example I constructed, we start with
Open High Target Low
06:59:59.999 30301.46 30655.00 30600.0 30301.46
07:59:59.999 30487.77 30670.51 30600.0 30487.77
08:59:59.999 30457.01 30523.51 30500.0 30457.01
09:59:59.999 30345.73 30520.00 30521.0 30345.73
10:59:59.999 30441.94 30519.28 30500.0 30441.94
and end up with
Open High Target Low
06:59:59.999 30301.46 30655.00 NaN 30301.46
07:59:59.999 30487.77 30670.51 NaN 30487.77
08:59:59.999 30457.01 30523.51 NaN 30457.01
09:59:59.999 30345.73 30520.00 30521.0 30345.73
10:59:59.999 30441.94 30519.28 NaN 30441.94
I suspect that the following is a bit faster.
## find 'Target' values that need to be NaN'ed ##
row_bool = [
((df.loc[idx:,"High"] > row["Target"])
&(df.loc[idx:,"Low"] < row["Target"])
).any() for idx,row in df.iterrows()
]
## set those values to NaN ##
df.loc[row_bool,"Target"] = np.nan
Interestingly (to me), this means that the whole thing can be done with a single (not very readable) line of code:
df.loc[[((df.loc[idx:,"High"] > row["Target"])&(df.loc[idx:,"Low"] < row["Target"])).any() for idx,row in df.iterrows()],"Target"] = np.nan