I have a Pandas dataframe with 3 columns: Date, Return, and Result that has stock gains % in the Return column. The dates are set to daily and the index is reset.
The dataframe has hundreds of rows. I'm trying to fill in the current value of a row by using the previous value in the results column. I can do this manually like this:
df["Result"].iloc[0] = 100 * (1 df["Return"].iloc[0])
df["Result"].iloc[1] = df["Result"].iloc[0] * (1 df["Return"].iloc[1])
The problem is when I try to make this into a function and I've also tried using a lambda function with no results. Other resources oversimplify the examples. Can anyone help here?
This is one of several different iterations of an attempted function that didn't work.
def result_calc(df, i):
df["Result"].iloc[i] = df["Result"].iloc[i-1] * (1 df["Return"].iloc[i])
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
\<ipython-input-410-c6a07bcb23a7\> in \<module\>
1 df1 = buy_hold_df.copy()
\----\> 2 df1.iloc\[1:\]\["Result"\] = df1.iloc\[1:\]\["Result"\].apply(result_calc(df1, 1))
\~\\Anaconda3\\envs\\FFN\\lib\\site-packages\\pandas\\core\\series.py in apply(self, func, convert_dtype, args, \*\*kwds)
4106 else:
4107 values = self.astype(object).\_values
\-\> 4108 mapped = lib.map_infer(values, f, convert=convert_dtype)
4109
4110 if len(mapped) and isinstance(mapped\[0\], Series):
pandas_libs\\lib.pyx in pandas.\_libs.lib.map_infer()
TypeError: 'NoneType' object is not callable
CodePudding user response:
IIUC:
# pseudo code
result[0] = 100 * (1 return[0])
result[1] = result[0] * (1 return[1]) = 100 * (1 return[0]) * (1 return[1])
...
return[n] = result[n-1] * (1 return[n])
= 100 * (1 return[0]) * (1 return[1]) * ... * (1 return[n])
= 100 * cumprod(1 return)
So:
>>> import pandas as pd
>>>
>>> df = pd.DataFrame(dict(
... Date=["2019-01-02", "2019-01-03", "2019-01-04", "2019-01-07", "2019-01-08"],
... Return=[.035039, .001354, .025693, .018128, .012625]
... ))
>>> df["Result"] = 100 * (df.Return 1).cumprod()
>>> df
Date Return Result
0 2019-01-02 0.035039 103.503900
1 2019-01-03 0.001354 103.644044
2 2019-01-04 0.025693 106.306971
3 2019-01-07 0.018128 108.234103
4 2019-01-08 0.012625 109.600559