How to compute financial ratio based on specific date range for multiple stocks dataframe?-CodePudding

I am trying to write a code that will compute Sharpe ratio for a list of stocks for every calendar quarter. Here is a sample of my dataframe (df):

DataFrame

So, firstly, I wrote a function that calculates a Sharpe ratio:

def compute_sr(df, risk_free_rate=0):

mean_return = df.mean()
std = df.std()
sharpe_ratio = (mean_return - risk_free_rate)/std

# quarterly annualized Sharpe Ratio
return sharpe_ratio * np.sqrt(62)

Then, I wrote a code that calculates a Sharpe ratio for every stock that includes in a list of stocks, based on calendar quarter, and adding the result in a new column:

start = '2018-01-01'

end = '2021-12-21'

list_of_stocks = ['AAPL', 'MSFT', 'FB']

date_range = pd.date_range(start=start, end=end)

for stock in list_of_stocks:
df['Sharpe Ratio'] = df[df['Symbol'] == stock]['Log Return'][compute_sr(df.loc[date_range - pd.offsets.DateOffset(months=3): date_range, 'Log Return']) 
         for date_range in df.index]

Unfortunately, I got the following syntax error:

File "<ipython-input-211-a78d09294412>", line 3
for date_range in trading_data.index]
^
SyntaxError: invalid syntax

What is the reason for the syntax error?

Thanks in advance.

CodePudding user response：

I think you have three problems here: what you think you are doing with dates, your compute_sr function and this messy thing inside your stocks loop.

First, the dates:

When you do date_range = pd.date_range(start=start, end=end), you get and array. So, you won't be able to select data with .loc and get the dateframe you want to pass as an argument to your compute_sr function.

You have:

df.loc[an_array: another_array, 'Log Return']

Pandas can't figure out where to start and where to end the index of the dataframe to be returned. You neeed:

df.loc[a_start_date:an_end_date, 'Log Return']

But let's assume you will get a dataframe, that you would pass to compute_sr. Here's what it is doing:

df = pd.DataFrame({'Date':['2018-01-03','2018-01-04','2018-01-05','2018-01-08','2018-01-09','2018-01-03','2018-01-04','2018-01-05','2018-01-08','2018-01-09'], 'Symbol':['AAPL','AAPL','AAPL','AAPL','AAPL','GOOGL''GOOGL','GOOGL','GOOGL','GOOGL'], 'AdjClose':[41.12,41.31,41.79,41.63,41.63,2265.89,2465.89,2625.89,2565.89,2165.89],'LogReturn':[-0.000175,0.004634,0.011321,-0.003721,-0.000115,-0.003452,0.004111,0.032111,0.003721,-0.000115]})

# calculate_sr
mean_return = df[df['Symbol'] == stock]['LogReturn'].mean()
std = df[df['Symbol'] == stock]['LogReturn'].std()
sharpe_ratio = (mean_return - 0)/std
print(sharpe_ratio)

0.4111951194083935

So, that messy line is in fact:

df1 = df[df['Symbol'] == stock]['Log Return']
# compute_sr will resolve to `sharpe_ratio`
sharpe_ratio = 0.4111951194083935
df1[sharpe_ratio]

This will return a type error because 0.41119... is not in the index or a column name.

Here's what you can do:

df['Date'] = df['Date'].apply(pd.Timestamp)
# Get an year and quarter for each value for 'Date'
df['Quarter'] = df['Date'].apply(lambda x: x.quarter)
df['Year'] = df['Date'].apply(lambda x: x.year)

# Filter dataframe to get only data between 2018 and 2021
df = df[(df['Year'] > 2017) & (df['Year'] < 2022)]
quarterly = df.groupby(['Symbol', 'Year', 'Quarter'])

# 
def compute_sr(x, risk_free_rate=0):
    mean_return = x.mean()
    std = x.std()
    sharpe_ratio = (mean_return - risk_free_rate)/std

    # quarterly annualized Sharpe Ratio
    return sharpe_ratio * np.sqrt(62)

sr = quarterly['LogReturn'].apply(compute_sr)
print(sr)

Symbol  Year  Quarter
AAPL    2018  1         3.237754
GOOGL   2018  1         4.027694
Name: LogReturn, dtype: float64