Loops using while in pandas-CodePudding

I am a new python user, trying trying to improve from Excel. I am mostly managing so far, but I got stuck on loops. The example I'm working on is creating a stock index (or portfolio) where weights of individual stocks would be based on their market capitalizations, but capped at a certain level. I started with these numbers:

import pandas as pd
df=pd.DataFrame(
    {
        "Ticker":[
            "AAPL","GOGL","AMZN","NVDA","MSFT","NFLX",
        ],
        "mcap":[100000,50000,10000,2000,1000,1000],
    }
)

which gives me this:

    Ticker  mcap
0   AAPL    100000
1   GOGL    50000
2   AMZN    10000
3   NVDA    2000
4   MSFT    1000
5   NFLX    1000

Next I defined the step for the optimization and the maximum weight a stock can have in index:

step=0.05
max_weight_in_index=0.3

Then I expanded the table in a way which would allow me to cap individual weights. Basically, the total value of a stock I put into the index (mcap_in_index) equals mcap * correction_factor. For now the factor is 1, meaning all stock enter the index with their full mcap. The weight of a stock in the index is calculated based on mcap_in_index. The last column (needs_correcting) basically tells me if I need to adjust the correction factor or not. If a stock's current weight in the index is above the maximum weight I defined before, it should be corrected. This is how I got to this step:

df["correction_factor"]=1
df["mcap_in_index"]=df["mcap"]*df["correction_factor"]
df["weight_in_index"]=df["mcap_in_index"]/df["mcap_in_index"].sum()
df.loc[df["weight_in_index"]<=max_weight_in_index,"needs_correcting"]=0
df.loc[df["weight_in_index"]>max_weight_in_index,"needs_correcting"]=1

Now my table looks like this (as you can see, I couldn't figure out how to display this in a more visually appealing way, sorry for that):

    Ticker  mcap    correction_factor   mcap_in_index   weight_in_index needs_correcting
0   AAPL    100000  1   100000  0.609756    1.0
1   GOGL    50000   1   50000   0.304878    1.0
2   AMZN    10000   1   10000   0.060976    0.0
3   NVDA    2000    1   2000    0.012195    0.0
4   MSFT    1000    1   1000    0.006098    0.0
5   NFLX    1000    1   1000    0.006098    0.0

Now I'm looking at the column needs_correcting. A 1 means that the current weight of the stock is more than what I allow with max_weight_in_index. The correction is done by reducing the correction factor by the step I defined earlier. Then I recalculate everything else:

df.loc[df["needs_correcting"]==1,"correction_factor"]=df["correction_factor"]-step
df["mcap_in_index"]=df["mcap"]*df["correction_factor"]
df["weight_in_index"]=df["mcap_in_index"]/df["mcap_in_index"].sum()
df.loc[df["weight_in_index"]<=max_weight_in_index,"needs_correcting"]=0
df.loc[df["weight_in_index"]>max_weight_in_index,"needs_correcting"]=1

Now my table looks like this. As you can see, it substracted the step from the correction factor in the case of AAPL and GOGL.

    Ticker  mcap    correction_factor   mcap_in_index   weight_in_index needs_correcting
0   AAPL    100000  0.95    95000.0 0.607029    1.0
1   GOGL    50000   0.95    47500.0 0.303514    1.0
2   AMZN    10000   1.00    10000.0 0.063898    0.0
3   NVDA    2000    1.00    2000.0  0.012780    0.0
4   MSFT    1000    1.00    1000.0  0.006390    0.0
5   NFLX    1000    1.00    1000.0  0.006390    0.0

What I cannot figure out is a way to loop this last step. Basically, redo the last step until all the values in needs_correcting equal 0 (their sum needs to equal 0).

"Looping" manually, I got the following table. It's what the code should do ideally. Of course, in real life the step I would use would be much smaller, so looping "manually" would take forever and makes no sense.

Ticker  mcap    correction_factor   mcap_in_index   weight_in_index needs_correcting
0   AAPL    100000  0.1 10000.0 0.294118    0.0
1   GOGL    50000   0.2 10000.0 0.294118    0.0
2   AMZN    10000   1.0 10000.0 0.294118    0.0
3   NVDA    2000    1.0 2000.0  0.058824    0.0
4   MSFT    1000    1.0 1000.0  0.029412    0.0
5   NFLX    1000    1.0 1000.0  0.029412    0.0

Sorry for the very long post, this is my first time posting here and I wasn't sure how to explain what I'm trying to do.

CodePudding user response：

while df["needs_correcting"].sum()>0:
    df.loc[df["needs_correcting"]==1,"correction_factor"]-=step
    df["mcap_in_index"]=df["mcap"]*df["correction_factor"]
    df["weight_in_index"]=df["mcap_in_index"]/df["mcap_in_index"].sum()
    df.loc[df["weight_in_index"]<=max_weight_in_index,"needs_correcting"]=0
    df.loc[df["weight_in_index"]>max_weight_in_index,"needs_correcting"]=1

df
###
  Ticker    mcap  correction_factor  mcap_in_index  weight_in_index  \
0   AAPL  100000                0.1        10000.0         0.294118   
1   GOGL   50000                0.2        10000.0         0.294118   
2   AMZN   10000                1.0        10000.0         0.294118   
3   NVDA    2000                1.0         2000.0         0.058824   
4   MSFT    1000                1.0         1000.0         0.029412   
5   NFLX    1000                1.0         1000.0         0.029412   

   needs_correcting  
0               0.0  
1               0.0  
2               0.0  
3               0.0