I have a DataFrame of stock OHLC data and would like to find how many times the price crosses option strikes (as a single summary statistic).
The dataframe:
                               open      high       low     close     volume       datetime                  datetime2  n_strike  strk_diff  pinned_min
datetime2
2021-08-20 09:30:00-04:00  147.4400  147.5619  147.1201  147.3725  1660122.0  1629466200000  2021-08-20 13:30:00+00:00       145     2.3725           1
2021-08-20 09:31:00-04:00  147.3800  147.6600  147.1200  147.1350   430097.0  1629466260000  2021-08-20 13:31:00+00:00       145     2.1350           1
2021-08-20 09:32:00-04:00  147.1297  147.4800  147.0400  147.0550   308090.0  1629466320000  2021-08-20 13:32:00+00:00       145     2.0550           1
2021-08-20 09:33:00-04:00  147.1000  147.3199  147.0200  147.2348   285100.0  1629466380000  2021-08-20 13:33:00+00:00       145     2.2348           1
2021-08-20 09:34:00-04:00  147.2367  147.2600  146.9600  147.1250   290185.0  1629466440000  2021-08-20 13:34:00+00:00       145     2.1250           1
...                             ...       ...       ...       ...        ...            ...                        ...       ...        ...         ...
2022-07-15 15:55:00-04:00  149.8900  149.9800  149.8400  149.9550   525630.0  1657914900000  2022-07-15 19:55:00+00:00       150     0.0450           0
2022-07-15 15:56:00-04:00  149.9600  150.0000  149.9100  149.9900   675573.0  1657914960000  2022-07-15 19:56:00+00:00       150     0.0100           0
2022-07-15 15:57:00-04:00  149.9900  150.0000  149.9400  149.9900   464692.0  1657915020000  2022-07-15 19:57:00+00:00       150     0.0100           0
2022-07-15 15:58:00-04:00  149.9900  150.0500  149.9200  150.0300   753358.0  1657915080000  2022-07-15 19:58:00+00:00       150     0.0300           0
2022-07-15 15:59:00-04:00  150.0300  150.2500  149.9700  150.1700  1978823.0  1657915140000  2022-07-15 19:59:00+00:00       150     0.1700
For each row in the dataframe, I want to count how many strike prices the bar crosses and store the result in a new column. My code is as follows:
# make a list of the strikes
strikes = [*range(0, (round(df_expfri['high'].max()) + 5), 5)]
for row in df_temp:
    H = df_temp['high']
    L = df_temp['low']
    count = 0
    for x in strikes:
        if x < L :
            continue
        elif x > H:
            continue
        elif x > L & x < H:
            count += 1
    print (count)
The error I'm getting is below. If I'm interpreting it correctly, my variables H and L are Series, and that is what causes the problem, but I'm unsure how to resolve it.
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [133], in <cell line: 7>()
     10 count = 0
     11 for x in strikes:
---> 12     if x < L :
     13         continue
     14     elif x > H:
File C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py:1535, in NDFrame.__nonzero__(self)
   1533 @final
   1534 def __nonzero__(self):
-> 1535     raise ValueError(
   1536         f"The truth value of a {type(self).__name__} is ambiguous. "
   1537         "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
   1538     )
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Thank you in advance.
CodePudding user response:
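The problem is that, although the for loop declares a row variable, the loop body never uses it: H and L are read from the whole dataframe, so each is a full Series rather than a single number. A comparison such as x < L then produces a Series of booleans, and an if statement cannot reduce that to one True or False, hence the ValueError. Note also that iterating over a DataFrame directly yields its column labels, not its rows.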
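A small sketch of what goes wrong, using a made-up two-row frame (the values are illustrative only, not the asker's data):

```python
import pandas as pd

# Made-up two-row frame, only for illustration.
df_temp = pd.DataFrame({"high": [147.56, 150.25], "low": [147.12, 149.97]})

# Iterating a DataFrame directly yields its column labels, not its rows.
labels = [item for item in df_temp]

# Comparing a scalar with a whole column gives one boolean per row.
mask = 145 < df_temp["low"]

# Using that Series in an `if` raises the ValueError from the question.
try:
    if mask:
        pass
    raised = False
except ValueError:
    raised = True
```

Here labels holds the column names, mask is a boolean Series with one entry per row, and raised ends up True.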
Replace the for line with:
for _, row in df_temp.iterrows():
And read H and L from the current row instead of the whole dataframe:
H = row['high']
L = row['low']
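Putting the pieces together, here is a minimal sketch of the whole loop, assuming a strike counts as crossed when it lies strictly between the bar's low and high. The sample frame, the helper count_crossed and the column names n_crossed and n_crossed_vec are hypothetical, introduced only for this example. It uses a chained comparison (low < x < high) rather than x > L & x < H, because & binds more tightly than the comparison operators and so would not test what was intended. A vectorized variant with NumPy broadcasting is included as an alternative for large frames, since iterrows can be slow on minute-level data.

```python
import numpy as np
import pandas as pd

# Made-up bars, only for illustration; substitute the real df_temp.
df_temp = pd.DataFrame(
    {"high": [147.56, 150.25, 155.10], "low": [147.12, 149.97, 144.80]}
)

# Strikes every 5 points, up to just past the highest high.
strikes = list(range(0, int(round(df_temp["high"].max())) + 5, 5))


def count_crossed(low, high, strikes):
    """Count the strikes lying strictly between low and high."""
    return sum(1 for x in strikes if low < x < high)


# Row-by-row version: iterrows yields (index, row) pairs.
counts = []
for _, row in df_temp.iterrows():
    counts.append(count_crossed(row["low"], row["high"], strikes))
df_temp["n_crossed"] = counts

# Vectorized version: compare every bar with every strike at once.
s = np.asarray(strikes)                   # shape (n_strikes,)
lo = df_temp["low"].to_numpy()[:, None]   # shape (n_rows, 1)
hi = df_temp["high"].to_numpy()[:, None]  # shape (n_rows, 1)
df_temp["n_crossed_vec"] = ((s > lo) & (s < hi)).sum(axis=1)

# Single summary statistic: total crossings over all bars.
total = int(df_temp["n_crossed"].sum())
```

Summing the new column gives the single summary statistic asked for in the question. If a bar that merely touches a strike should also count, change the strict comparisons to <= and >=.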