I am working with a Pandas DataFrame containing a 20-year sample of monthly stock prices and corresponding returns for different stocks. After sorting the sample by 'Date' and by 'Book-To-Market' ('B/M', for short), I want to create a column that assigns a value of -1 to the stock returns of the companies with the lowest B/M and a value of 1 to the stock returns of the companies with the highest B/M (half of the sample assigned -1 and the other half assigned 1) for every month in the sample.
I tried creating a separate DataFrame for every month in the sample and assigning the values in each one, but I found that impractical. Please find my code below:
df = df.sort_values(by=['Date', 'B/M'], ascending=True)
df
# Desired outcome (note: 'Date', dtype='datetime64[ns]', has been set as the index).
|Date | Company Name | B/M | Monthly Return | Signal |
|2000-01-31 | ORACLE | 0.29 | 0.048 | -1 |
|2000-01-31 | MICROSOFT | 0.37 | 0.032 | 1 |
|... |... |... |... |... |
|2000-02-29 | MICROSOFT | 0.08 | 0.016 | -1 |
|2000-02-29 | ORACLE | 0.30 | -0.07 | 1 |
Thank you for your time.
CodePudding user response:
Try this
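# Assumes df is sorted by Date and B/M, has a default 0..n-1 integer index
# (e.g. after reset_index), and contains exactly two companies per month.
signals = []
for i in range(0, len(df), 2):
    # Compare the two rows that belong to the same month.
    if df.loc[i + 1, "B/M"] - df.loc[i, "B/M"] < 0:
        signals.extend([1, -1])
    else:
        signals.extend([-1, 1])
df["Signal"] = signals
df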
Output:
         Date Company Name   B/M  Monthly Return  Signal
0  2000-01-31       ORACLE  0.29           0.048      -1
1  2000-01-31    MICROSOFT  0.37           0.032       1
2  2000-02-29    MICROSOFT  0.08           0.016      -1
3  2000-02-29       ORACLE  0.30          -0.070       1
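If there are more than two companies per month, a pairwise loop no longer applies. One possible alternative is to rank B/M within each month and split at the halfway point. This is only a sketch: the sample data is copied from the question for illustration, 'Date' is assumed to be a regular column (grouping by an index level named 'Date' should behave the same way), and with an odd number of companies in a month the middle one ends up with 1.

```python
import pandas as pd

# Illustrative data only, mirroring the layout shown in the question.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2000-01-31", "2000-01-31", "2000-02-29", "2000-02-29"]),
    "Company Name": ["ORACLE", "MICROSOFT", "MICROSOFT", "ORACLE"],
    "B/M": [0.29, 0.37, 0.08, 0.30],
    "Monthly Return": [0.048, 0.032, 0.016, -0.07],
})

grouped = df.groupby("Date")["B/M"]
rank = grouped.rank(method="first")   # 1..n within each month, lowest B/M first
size = grouped.transform("count")     # number of companies in that month

# Lower half of each month gets -1, upper half gets 1.
df["Signal"] = (rank > size / 2).map({True: 1, False: -1})
print(df)
```

Because the rank is computed per group, the DataFrame does not need to be sorted or re-indexed first.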
CodePudding user response:
First, calculate the median of B/M (note that, as written, this is the median over the whole sample, not per month):
bm_median = df["B/M"].median()
Then create a column where you check if the B/M is smaller than the median
df["signal"]=df["B/M"]<bm_median
Then map True to -1 and False to 1:
# Convert the boolean flags to -1 / 1 and assign the result back to the column.
df["signal"] = df["signal"].map({True: -1, False: 1})
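Since the question asks for the split to be made for every month, the same median idea can be applied per month rather than over the whole sample. A minimal sketch, assuming 'Date' is a column (or the name of the index) and using the question's sample rows purely as illustrative data:

```python
import numpy as np
import pandas as pd

# Illustrative data only, mirroring the layout shown in the question.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2000-01-31", "2000-01-31", "2000-02-29", "2000-02-29"]),
    "Company Name": ["ORACLE", "MICROSOFT", "MICROSOFT", "ORACLE"],
    "B/M": [0.29, 0.37, 0.08, 0.30],
    "Monthly Return": [0.048, 0.032, 0.016, -0.07],
})

# Median B/M of each month, broadcast back to every row of that month.
monthly_median = df.groupby("Date")["B/M"].transform("median")

# Below the month's median -> -1, otherwise -> 1.
df["signal"] = np.where(df["B/M"] < monthly_median, -1, 1)
print(df)
```

Rows whose B/M equals the month's median receive 1 here; change the comparison if ties should be handled differently.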