I have a dataframe containing m series of stock prices for different companies like the one below:
Date  Stock 1  Stock 2  Stock 3  ...  Stock m
1     100      200      300      ...  500
2     500      300      200      ...  100
:     :        :        :        ...  :
n     200      300      400      ...  100
What I need to do is find out which of the m stocks has the highest variance and then swap its column with that of Stock 1. Assuming that in the above example Stock 3 is the one with the highest variance, the final output should be:
Date  Stock 3  Stock 2  Stock 1  ...  Stock m
1     300      200      100      ...  500
2     200      300      500      ...  100
:     :        :        :        ...  :
n     400      300      200      ...  100
In order to find the column with highest variance I tried computing:
print(max(df.var()))
However, this only yields the amount of variance without printing the stock ticker. How can I fix this?
CodePudding user response:
You can use .insert() to place a column at a specified position, after getting the column name from .idxmax() and its location from Index.get_loc():
col = df.set_index('Date').var().idxmax()    # name of the column with the highest variance
stock1_col = df.columns[1]                   # name of the first stock column
if col != stock1_col:                        # nothing to swap if the first stock already has the highest variance
    from_pos = df.columns.get_loc(col)       # original position of the max-variance column
    stock1 = df.pop(stock1_col)              # take out the first stock column
    df.insert(1, col, df.pop(col))           # move the max-variance column to the first stock's position
    df.insert(from_pos, stock1_col, stock1)  # put the first stock where the max-variance column was
Result (Stock m is the column with the highest variance in the sample data):
print(col)
Stock m
print(df)
  Date  Stock m  Stock 2  Stock 3  Stock 1
0    1      500      200      300      100
1    2      100      300      200      500
2    n      100      300      400      200
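As an alternative to popping and re-inserting columns, the same swap can be sketched by reordering the list of column names and selecting the columns in that order. This is only a minimal sketch, not the method from the answer above: the helper name swap_max_var_with_first and its date_col parameter are made up for illustration, and the sample frame copies the result table with the placeholder date "n" replaced by 3 so that every value is numeric.

```python
import pandas as pd

# Sample data mirroring the table above (the "n" date is replaced by 3 here).
df = pd.DataFrame({
    "Date": [1, 2, 3],
    "Stock 1": [100, 500, 200],
    "Stock 2": [200, 300, 300],
    "Stock 3": [300, 200, 400],
    "Stock m": [500, 100, 100],
})


def swap_max_var_with_first(frame, date_col="Date"):
    """Hypothetical helper: return a frame in which the highest-variance
    stock column and the first stock column have traded places."""
    stock_cols = [c for c in frame.columns if c != date_col]
    top = frame[stock_cols].var().idxmax()  # name of the max-variance column
    first = stock_cols[0]                   # name of the first stock column

    order = list(frame.columns)
    i, j = order.index(first), order.index(top)
    order[i], order[j] = order[j], order[i]  # no-op when top is already first
    return frame[order]


swapped = swap_max_var_with_first(df)
print(list(swapped.columns))
```

Because this version only permutes column names, it returns a reordered frame instead of modifying df in place, and it needs no special case when the first stock already has the highest variance.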