I have a Multi Index dataframe that looks like
Mid
Strike Expiration Symbol
167.5 2022-02-11 AAPL170 5.4
170 2022-02-11 AAPL170 3.1
2022-02-18 AAPL170 4.525
2022-02-25 AAPL170 5.25
2022-03-04 AAPL170 6.00
172.5 2022-02-11 AAPL172 1.265
2022-02-18 AAPL172 2.91
175 2022-02-11 AAPL175 0.265
2022-02-18 AAPL175 1.695
so it is a multi index with the index of strike
expiration
and symbol
, and then mid
is just a column name, and NOT an index. I have a few other columns but they are not important for now. These are sorted in increasing value from the first index, I'm hoping to divide each Mid
row by the next row below it, but only within each individual strike
index. Currently, I am doing
df['ratio'] = (df['mid'] / df['mid'].shift(-1))
and it works to give me a new column of all the divisions, but I'm running into problems where, for example, the 167.5 2022-02-11
row is getting divided by the 170 2022-02-11
row, and I need those to remain separated by index.
My goal is after this division is done, to be able to search for any ratios that are above a cutoff, ex. 0.5, and output what was divided to a new dataframe, so something similar to
Strike Expirations Symbol Ratio
170 2022-02-11 / 2022-02-18 AAPL170 0.685
If someone could advise on both parts, I'd greatly appreciate it but I primarily need the first part fixed.
CodePudding user response:
One way using pandas.DataFrame.groupby
with pct_change
:
new_df = df.groupby(level=0).pct_change(-1) 1
print(new_df)
Output:
Mid
Strike Expiration Symbol
167.5 2022-02-11 AAPL170 NaN
170.0 2022-02-11 AAPL170 0.685083
2022-02-18 AAPL170 0.861905
2022-02-25 AAPL170 0.875000
2022-03-04 AAPL170 NaN
172.5 2022-02-11 AAPL172 0.434708
2022-02-18 AAPL172 NaN
175.0 2022-02-11 AAPL175 0.156342
2022-02-18 AAPL175 NaN
For debugging, you can try something like:
for _, d in df.groupby(level=0):
try:
d.pct_change()
except:
print(d)
break