How do I create a column based on the values of two existing columns?

Time:10-02

I am working with a Pandas DataFrame containing a 20-year sample of monthly stock prices and corresponding returns for different stocks. After sorting the sample by 'Date' and by 'Book-To-Market' ('B/M', for short), I want to create a column that assigns a value of -1 to the stock returns of the companies with the lowest B/M and a value of 1 to the stock returns of the companies with the highest B/M (half of the sample assigned -1 and the other half assigned 1) for every month in the sample.

I tried creating a separate DataFrame for every month in the sample and assigning the values there, but I found it impractical. Please find the code below:

df = df.sort_values(by=['Date', 'B/M'], ascending=True)
df

# Desired outcome (note: 'Date', dtype='datetime64[ns]', has been set as the index).

| Date       | Company Name | B/M  | Monthly Return | Signal |
| 2000-01-31 | ORACLE       | 0.29 | 0.048          | -1     |
| 2000-01-31 | MICROSOFT    | 0.37 | 0.032          | 1      |
| ...        | ...          | ...  | ...            | ...    |
| 2000-02-29 | MICROSOFT    | 0.08 | 0.016          | -1     |
| 2000-02-29 | ORACLE       | 0.30 | -0.07          | 1      |

Thank you for your time.

CodePudding user response:

Try this

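# Assumes exactly two rows per month, sorted by 'Date' and 'B/M' as in the question.
df = df.reset_index()  # move 'Date' out of the index so rows are labelled 0, 1, 2, ...
l = list()
for i in range(0, len(df), 2):
    # Compare the two companies that belong to the same month.
    if df.loc[i + 1, "B/M"] - df.loc[i, "B/M"] < 0:
        l.extend([1, -1])
    else:
        l.extend([-1, 1])
df["Signal"] = l
df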

Output:

    Date        Company Name    B/M    Monthly Return   Signal
0   2000-01-31    ORACLE        0.29        0.048        -1
1   2000-01-31    MICROSOFT     0.37        0.032         1
2   2000-02-29    MICROSOFT     0.08        0.016        -1
3   2000-02-29    ORACLE        0.30        -0.070        1
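
The loop above relies on every month having exactly two companies. If a month can hold more than that, one possible alternative is to rank B/M within each month and split on the rank. This is only a sketch: the toy data copies the rows from the question, 'Date' is assumed to be a regular column (or a named index level), and sending the middle company of an odd-sized month to the upper half is my own assumption, not something the question specifies.

```python
import pandas as pd

# Toy data copied from the example rows in the question.
df = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            ["2000-01-31", "2000-01-31", "2000-02-29", "2000-02-29"]
        ),
        "Company Name": ["ORACLE", "MICROSOFT", "MICROSOFT", "ORACLE"],
        "B/M": [0.29, 0.37, 0.08, 0.30],
        "Monthly Return": [0.048, 0.032, 0.016, -0.07],
    }
)

# Rank B/M within each month (1 = lowest); method="first" breaks ties by row order.
rank = df.groupby("Date")["B/M"].rank(method="first")

# Number of companies in each month, aligned row by row with df.
count = df.groupby("Date")["B/M"].transform("count")

# Lower half of each month gets -1, upper half gets 1.
# With an odd count the middle company lands in the upper half (assumption).
df["Signal"] = (rank > count // 2).map({True: 1, False: -1})
print(df)
```

Because the ranking is done per group, the DataFrame does not need to be sorted first, and no per-month DataFrames are needed.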

CodePudding user response:

OK, please first calculate the median of B/M

bm_median = df["B/M"].median()

Then create a column where you check if the B/M is smaller than the median

df["signal"] = df["B/M"] < bm_median

Then replace False with 1 and True with -1

# Assign the result back: replace(..., inplace=True) called on df["signal"] may not update df in recent pandas versions.
df["signal"] = df["signal"].map({False: 1, True: -1})
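
Note that the median above is taken over the whole sample, whereas the question asks for a split within every month. A minimal sketch of a per-month variant, assuming 'Date' is a regular column (or a named index level) and using the example rows from the question as toy data:

```python
import numpy as np
import pandas as pd

# Toy data copied from the example rows in the question.
df = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            ["2000-01-31", "2000-01-31", "2000-02-29", "2000-02-29"]
        ),
        "Company Name": ["ORACLE", "MICROSOFT", "MICROSOFT", "ORACLE"],
        "B/M": [0.29, 0.37, 0.08, 0.30],
        "Monthly Return": [0.048, 0.032, 0.016, -0.07],
    }
)

# Median B/M of each month, broadcast back to every row of that month.
monthly_median = df.groupby("Date")["B/M"].transform("median")

# Below the month's median -> -1, otherwise (including ties) -> 1.
df["signal"] = np.where(df["B/M"] < monthly_median, -1, 1)
print(df)
```

With an odd number of companies in a month, the company sitting exactly on the median gets 1 here; flip the comparison if you would rather have it on the other side.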