I am trying to write a code that will compute Sharpe ratio for a list of stocks for every calendar quarter. Here is a sample of my dataframe (df):
So, firstly, I wrote a function that calculates a Sharpe ratio:
def compute_sr(df, risk_free_rate=0):
mean_return = df.mean()
std = df.std()
sharpe_ratio = (mean_return - risk_free_rate)/std
# quarterly annualized Sharpe Ratio
return sharpe_ratio * np.sqrt(62)
Then, I wrote a code that calculates a Sharpe ratio for every stock that includes in a list of stocks, based on calendar quarter, and adding the result in a new column:
start = '2018-01-01'
end = '2021-12-21'
list_of_stocks = ['AAPL', 'MSFT', 'FB']
date_range = pd.date_range(start=start, end=end)
for stock in list_of_stocks:
df['Sharpe Ratio'] = df[df['Symbol'] == stock]['Log Return'][compute_sr(df.loc[date_range - pd.offsets.DateOffset(months=3): date_range, 'Log Return'])
for date_range in df.index]
Unfortunately, I got the following syntax error:
File "<ipython-input-211-a78d09294412>", line 3
for date_range in trading_data.index]
^
SyntaxError: invalid syntax
What is the reason for the syntax error?
Thanks in advance.
CodePudding user response:
I think you have three problems here: what you think you are doing with dates, your compute_sr
function and this messy thing inside your stocks loop.
First, the dates:
When you do date_range = pd.date_range(start=start, end=end)
, you get and array. So, you won't be able to select data with .loc
and get the dateframe you want to pass as an argument to your compute_sr
function.
You have:
df.loc[an_array: another_array, 'Log Return']
Pandas can't figure out where to start and where to end the index of the dataframe to be returned. You neeed:
df.loc[a_start_date:an_end_date, 'Log Return']
But let's assume you will get a dataframe, that you would pass to compute_sr
. Here's what it is doing:
df = pd.DataFrame({'Date':['2018-01-03','2018-01-04','2018-01-05','2018-01-08','2018-01-09','2018-01-03','2018-01-04','2018-01-05','2018-01-08','2018-01-09'], 'Symbol':['AAPL','AAPL','AAPL','AAPL','AAPL','GOOGL''GOOGL','GOOGL','GOOGL','GOOGL'], 'AdjClose':[41.12,41.31,41.79,41.63,41.63,2265.89,2465.89,2625.89,2565.89,2165.89],'LogReturn':[-0.000175,0.004634,0.011321,-0.003721,-0.000115,-0.003452,0.004111,0.032111,0.003721,-0.000115]})
# calculate_sr
mean_return = df[df['Symbol'] == stock]['LogReturn'].mean()
std = df[df['Symbol'] == stock]['LogReturn'].std()
sharpe_ratio = (mean_return - 0)/std
print(sharpe_ratio)
0.4111951194083935
So, that messy line is in fact:
df1 = df[df['Symbol'] == stock]['Log Return']
# compute_sr will resolve to `sharpe_ratio`
sharpe_ratio = 0.4111951194083935
df1[sharpe_ratio]
This will return a type error because 0.41119...
is not in the index or a column name.
Here's what you can do:
df['Date'] = df['Date'].apply(pd.Timestamp)
# Get an year and quarter for each value for 'Date'
df['Quarter'] = df['Date'].apply(lambda x: x.quarter)
df['Year'] = df['Date'].apply(lambda x: x.year)
# Filter dataframe to get only data between 2018 and 2021
df = df[(df['Year'] > 2017) & (df['Year'] < 2022)]
quarterly = df.groupby(['Symbol', 'Year', 'Quarter'])
#
def compute_sr(x, risk_free_rate=0):
mean_return = x.mean()
std = x.std()
sharpe_ratio = (mean_return - risk_free_rate)/std
# quarterly annualized Sharpe Ratio
return sharpe_ratio * np.sqrt(62)
sr = quarterly['LogReturn'].apply(compute_sr)
print(sr)
Symbol Year Quarter
AAPL 2018 1 3.237754
GOOGL 2018 1 4.027694
Name: LogReturn, dtype: float64