I am a new python user, trying trying to improve from Excel. I am mostly managing so far, but I got stuck on loops. The example I'm working on is creating a stock index (or portfolio) where weights of individual stocks would be based on their market capitalizations, but capped at a certain level. I started with these numbers:
import pandas as pd
df=pd.DataFrame(
{
"Ticker":[
"AAPL","GOGL","AMZN","NVDA","MSFT","NFLX",
],
"mcap":[100000,50000,10000,2000,1000,1000],
}
)
which gives me this:
Ticker mcap
0 AAPL 100000
1 GOGL 50000
2 AMZN 10000
3 NVDA 2000
4 MSFT 1000
5 NFLX 1000
Next I defined the step for the optimization and the maximum weight a stock can have in index:
step=0.05
max_weight_in_index=0.3
Then I expanded the table in a way which would allow me to cap individual weights. Basically, the total value of a stock I put into the index (mcap_in_index) equals mcap * correction_factor. For now the factor is 1, meaning all stock enter the index with their full mcap. The weight of a stock in the index is calculated based on mcap_in_index. The last column (needs_correcting) basically tells me if I need to adjust the correction factor or not. If a stock's current weight in the index is above the maximum weight I defined before, it should be corrected. This is how I got to this step:
df["correction_factor"]=1
df["mcap_in_index"]=df["mcap"]*df["correction_factor"]
df["weight_in_index"]=df["mcap_in_index"]/df["mcap_in_index"].sum()
df.loc[df["weight_in_index"]<=max_weight_in_index,"needs_correcting"]=0
df.loc[df["weight_in_index"]>max_weight_in_index,"needs_correcting"]=1
Now my table looks like this (as you can see, I couldn't figure out how to display this in a more visually appealing way, sorry for that):
Ticker mcap correction_factor mcap_in_index weight_in_index needs_correcting
0 AAPL 100000 1 100000 0.609756 1.0
1 GOGL 50000 1 50000 0.304878 1.0
2 AMZN 10000 1 10000 0.060976 0.0
3 NVDA 2000 1 2000 0.012195 0.0
4 MSFT 1000 1 1000 0.006098 0.0
5 NFLX 1000 1 1000 0.006098 0.0
Now I'm looking at the column needs_correcting. A 1 means that the current weight of the stock is more than what I allow with max_weight_in_index. The correction is done by reducing the correction factor by the step I defined earlier. Then I recalculate everything else:
df.loc[df["needs_correcting"]==1,"correction_factor"]=df["correction_factor"]-step
df["mcap_in_index"]=df["mcap"]*df["correction_factor"]
df["weight_in_index"]=df["mcap_in_index"]/df["mcap_in_index"].sum()
df.loc[df["weight_in_index"]<=max_weight_in_index,"needs_correcting"]=0
df.loc[df["weight_in_index"]>max_weight_in_index,"needs_correcting"]=1
Now my table looks like this. As you can see, it substracted the step from the correction factor in the case of AAPL and GOGL.
Ticker mcap correction_factor mcap_in_index weight_in_index needs_correcting
0 AAPL 100000 0.95 95000.0 0.607029 1.0
1 GOGL 50000 0.95 47500.0 0.303514 1.0
2 AMZN 10000 1.00 10000.0 0.063898 0.0
3 NVDA 2000 1.00 2000.0 0.012780 0.0
4 MSFT 1000 1.00 1000.0 0.006390 0.0
5 NFLX 1000 1.00 1000.0 0.006390 0.0
What I cannot figure out is a way to loop this last step. Basically, redo the last step until all the values in needs_correcting equal 0 (their sum needs to equal 0).
"Looping" manually, I got the following table. It's what the code should do ideally. Of course, in real life the step I would use would be much smaller, so looping "manually" would take forever and makes no sense.
Ticker mcap correction_factor mcap_in_index weight_in_index needs_correcting
0 AAPL 100000 0.1 10000.0 0.294118 0.0
1 GOGL 50000 0.2 10000.0 0.294118 0.0
2 AMZN 10000 1.0 10000.0 0.294118 0.0
3 NVDA 2000 1.0 2000.0 0.058824 0.0
4 MSFT 1000 1.0 1000.0 0.029412 0.0
5 NFLX 1000 1.0 1000.0 0.029412 0.0
Sorry for the very long post, this is my first time posting here and I wasn't sure how to explain what I'm trying to do.
CodePudding user response:
while df["needs_correcting"].sum()>0:
df.loc[df["needs_correcting"]==1,"correction_factor"]-=step
df["mcap_in_index"]=df["mcap"]*df["correction_factor"]
df["weight_in_index"]=df["mcap_in_index"]/df["mcap_in_index"].sum()
df.loc[df["weight_in_index"]<=max_weight_in_index,"needs_correcting"]=0
df.loc[df["weight_in_index"]>max_weight_in_index,"needs_correcting"]=1
df
###
Ticker mcap correction_factor mcap_in_index weight_in_index \
0 AAPL 100000 0.1 10000.0 0.294118
1 GOGL 50000 0.2 10000.0 0.294118
2 AMZN 10000 1.0 10000.0 0.294118
3 NVDA 2000 1.0 2000.0 0.058824
4 MSFT 1000 1.0 1000.0 0.029412
5 NFLX 1000 1.0 1000.0 0.029412
needs_correcting
0 0.0
1 0.0
2 0.0
3 0.0