Home > Blockchain >  Pandas Multi Index Division
Pandas Multi Index Division

Time:02-11

I have a Multi Index dataframe that looks like

                            Mid
Strike  Expiration  Symbol  
167.5   2022-02-11  AAPL170 5.4
170     2022-02-11  AAPL170 3.1
        2022-02-18  AAPL170 4.525
        2022-02-25  AAPL170 5.25
        2022-03-04  AAPL170 6.00
172.5   2022-02-11  AAPL172 1.265
        2022-02-18  AAPL172 2.91
175     2022-02-11  AAPL175 0.265
        2022-02-18  AAPL175 1.695

so it is a multi index with the index of strike expiration and symbol, and then mid is just a column name, and NOT an index. I have a few other columns but they are not important for now. These are sorted in increasing value from the first index, I'm hoping to divide each Mid row by the next row below it, but only within each individual strike index. Currently, I am doing

df['ratio'] = (df['mid'] / df['mid'].shift(-1))

and it works to give me a new column of all the divisions, but I'm running into problems where, for example, the 167.5 2022-02-11 row is getting divided by the 170 2022-02-11 row, and I need those to remain separated by index.

My goal is after this division is done, to be able to search for any ratios that are above a cutoff, ex. 0.5, and output what was divided to a new dataframe, so something similar to

Strike  Expirations                Symbol    Ratio
170     2022-02-11 / 2022-02-18    AAPL170   0.685

If someone could advise on both parts, I'd greatly appreciate it but I primarily need the first part fixed.

CodePudding user response:

One way using pandas.DataFrame.groupby with pct_change:

new_df = df.groupby(level=0).pct_change(-1)   1
print(new_df)

Output:

                                Mid
Strike Expiration Symbol           
167.5  2022-02-11 AAPL170       NaN
170.0  2022-02-11 AAPL170  0.685083
       2022-02-18 AAPL170  0.861905
       2022-02-25 AAPL170  0.875000
       2022-03-04 AAPL170       NaN
172.5  2022-02-11 AAPL172  0.434708
       2022-02-18 AAPL172       NaN
175.0  2022-02-11 AAPL175  0.156342
       2022-02-18 AAPL175       NaN

For debugging, you can try something like:

for _, d in df.groupby(level=0):
    try:
        d.pct_change()
    except:
        print(d)
        break
  • Related