Smart for loop in python for a portfolio performance-CodePudding

this is my first question here, so go easy on me.

I've computed a certain portfolio in python, for which I've gotten a dataframe (or list for that matter) of around 3000 daily returns in the format of

2006-12-19  -0.005811
2006-12-20  0.003131
2006-12-21  0.002231
2006-12-22  -0.005467
2006-12-27  0.008004
... ...
2022-03-25  -0.003178
2022-03-28  0.003613
2022-03-29  0.016891
2022-03-30  -0.007323
2022-03-31  -0.007277

My problem now is that I would like to statistically analyze the performance of it, so I need to start at various dates (to analyze certain periods) with 100 (percent) and then aggregate these returns to it. I tried for loops, but I got weird results, and was hoping to get a hint on here or an explanation to how to solve this.

My solution so far was this:

startvalue = 100
for row in TVWR:
    startvalue = startvalue*(1 row)

with this, I get the result of 475.8556658351451, which seems a little high. Additionally, I would like to get a result as a Dataframe looking something like this:

2006-12-19  99.994
2006-12-20  100.012
2006-12-21  100.432
2006-12-22  100.024
2006-12-27  100.895

Thank you for your kind help!

CodePudding user response：

First things first, I know of a proverb that says "If you loop in pandas, you're doing something wrong". Most of the time it's true.

In your case, you can use apply or transform to modify the series (I'll be guessing names since those are omitted):

df["values"] = df["values"].apply(lambda x: START_VALUE*(1 x))

df["values"] = df["values"].transform(lambda x: START_VALUE*(1 x))

Here, df is your DataFrame and values is the name of the column containing the percentages (-0.005811, 0.003131 etc...). START_VALUE is your startvalue variable, I am creating it in all caps to signify it should be treated as a constant.

Your example fails because you keep modifying startvariable :

for row in TVWR:
    startvalue = startvalue*(1 row)
    # this means that startvalue isn't 100 anymore !

CodePudding user response：

I was finally able to solve my issue:

THREE_DAILY_RETURNS is my DataFrame with three columns of daily returns

THREE_DAILY_RETURNS_CUM = (1   THREE_DAILY_RETURNS).cumprod() - 1
THREE_DAILY_RETURNS_CUM = THREE_DAILY_RETURNS_CUM   1
THREE_DAILY_RETURNS_CUM = THREE_DAILY_RETURNS_CUM * 100
THREE_DAILY_RETURNS_CUM

This gives me the desired result:


Total   SMIC    MeanReturn
Date            
2006-12-19  99.418891   99.306143   99.314898
2006-12-20  99.730204   99.393379   99.765376
2006-12-21  99.952733   99.446009   100.088569
2006-12-22  99.406320   98.893548   99.534840
2006-12-27  100.201991  99.976018   100.337066
... ... ... ...
2022-03-25  383.051964  222.143133  379.964060
2022-03-28  384.435785  223.113019  381.076173
2022-03-29  390.929235  226.226521  388.694509
2022-03-30  388.066547  224.724212  385.177961
2022-03-31  385.242695  223.304389  382.313810

Probably not the most elegant solution, but it worked for me.