I have a dataframe with dates (30/09/2022 to 31/11/2022) and the prices of 15 stocks for each of these dates, excluding weekends (only 5 stocks are shown here for reference).
Current Data:
DATES | A | B | C | D | E |
30/09/22 |100.5|151.3|233.4|237.2|38.42|
01/10/22 |101.5|148.0|237.6|232.2|38.54|
02/10/22 |102.2|147.6|238.3|231.4|39.32|
03/10/22 |103.4|145.7|239.2|232.2|39.54|
I wanted to get the Pearson correlation matrix, so I did this:
import numpy as np
import pandas as pd

df = pd.read_excel(file_path, sheet_name)
df = df.dropna()  # Remove dates that do not have prices for all stocks
# Log returns: log(price_t / price_{t-1})
log_df = df.set_index("DATES").pipe(lambda d: np.log(d.div(d.shift()))).reset_index()
corrM = log_df.corr()
Now I want to build the Pearson Uncentered Correlation Matrix, so I have the following function:
def uncentered_correlation(x, y):
    x_dim = len(x)
    y_dim = len(y)
    xy = 0
    xx = 0
    yy = 0
    for i in range(x_dim):
        xy = xy + x[i] * y[i]
        xx = xx + x[i] ** 2.0
        yy = yy + y[i] ** 2.0
    corr = xy / np.sqrt(xx * yy)
    return corr
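For reference, assuming x and y have the same length n, the quantity this function is meant to compute is the uncentered correlation, i.e. the cosine similarity of the two vectors (the symbol r_u is just a label chosen here):

```latex
r_u(x, y) = \frac{\sum_{i=1}^{n} x_i \, y_i}{\sqrt{\sum_{i=1}^{n} x_i^{2} \; \sum_{i=1}^{n} y_i^{2}}}
```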
However, I do not know how to apply this function to each possible pair of columns of the dataframe to get the correlation matrix.
CodePudding user response:
- First, compute the list of possible column combinations. You can use the itertools library for that.
- Then use pandas.DataFrame.apply() over multiple columns, as explained here.
Here is a simple code example:
import pandas as pd
import itertools

data = {'col1': [1, 3], 'col2': [2, 4], 'col3': [5, 6]}
df = pd.DataFrame(data)

def add(num1, num2):
    return num1 + num2

cols = list(df)
# All unordered pairs of columns
combList = list(itertools.combinations(cols, 2))
for tup in combList:
    firstCol = tup[0]
    secCol = tup[1]
    # Apply add() row by row to the two columns of this pair
    df[f'sum_{firstCol}_{secCol}'] = df.apply(lambda x: add(x[firstCol], x[secCol]), axis=1)
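The same pairing idea can be used to fill a correlation matrix rather than add row-wise columns. The following is only a sketch under assumptions: `returns` is hypothetical data standing in for the log-return columns (without DATES), `ucorr` is a name chosen here, and `uncentered_correlation` is rewritten with NumPy sums so that it does not depend on the Series index labels.

```python
import itertools

import numpy as np
import pandas as pd


def uncentered_correlation(x, y):
    """Uncentered correlation: sum(x*y) / sqrt(sum(x^2) * sum(y^2))."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum(x * y) / np.sqrt(np.sum(x**2) * np.sum(y**2)))


# Hypothetical log returns, standing in for log_df without the DATES column.
returns = pd.DataFrame(
    {
        "A": [0.010, -0.020, 0.015],
        "B": [-0.005, 0.010, -0.010],
        "C": [0.020, -0.010, 0.005],
    }
)

cols = list(returns.columns)
# Start from the identity: a column's uncentered correlation with itself is 1.
ucorr = pd.DataFrame(np.eye(len(cols)), index=cols, columns=cols)

for first, second in itertools.combinations(cols, 2):
    value = uncentered_correlation(returns[first], returns[second])
    ucorr.loc[first, second] = value
    ucorr.loc[second, first] = value  # the matrix is symmetric

print(ucorr)
```

Using combinations instead of a full product computes each pair once and mirrors it, which halves the work for 15 columns.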
CodePudding user response:
Try this. It is not very elegant, but it may work for you. :)
from itertools import product

import pandas as pd

def iter_product(a, b):
    return list(product(a, b))

df = 'your dataframe here'  # placeholder: replace with your DataFrame
re_dict = {}
# Every ordered pair of columns, including each column with itself
iter_re = iter_product(df.columns, df.columns)
for i in iter_re:
    result = uncentered_correlation(df[f'{i[0]}'], df[f'{i[1]}'])
    re_dict[i] = result
re_df = pd.DataFrame(re_dict, index=[0]).stack()
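If a square matrix is preferred over the stacked result, one possible shortcut is to pass the function to DataFrame.corr, which accepts a callable taking two 1-D arrays and returning a float. This is a sketch, not a tested solution for the full dataset: it assumes a pandas version that supports callables in `corr`, uses the three sample columns from the question, and rewrites `uncentered_correlation` with NumPy sums; `log_returns` and `uncentered_corr` are names chosen here.

```python
import numpy as np
import pandas as pd


def uncentered_correlation(x, y):
    """Uncentered correlation of two equal-length 1-D arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum(x * y) / np.sqrt(np.sum(x**2) * np.sum(y**2)))


# Sample prices copied from the question (columns A, B and C only).
prices = pd.DataFrame(
    {
        "A": [100.5, 101.5, 102.2, 103.4],
        "B": [151.3, 148.0, 147.6, 145.7],
        "C": [233.4, 237.6, 238.3, 239.2],
    },
    index=["30/09/22", "01/10/22", "02/10/22", "03/10/22"],
)

# Log returns; the first row is NaN because it has no previous price.
log_returns = np.log(prices / prices.shift()).dropna()

# corr() calls the function for each pair of distinct columns
# and sets the diagonal to 1.
uncentered_corr = log_returns.corr(method=uncentered_correlation)
print(uncentered_corr)
```

Note that corr() always puts 1 on the diagonal and returns a symmetric matrix regardless of the callable, which happens to match the uncentered correlation, since any non-zero column has a value of 1 with itself.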