Let's assume matrices X and Y of size 2x3 and 2x2, respectively. The function 'cor' in R returns a 3x2 matrix, while numpy.corrcoef in Python returns a 5x5 matrix. Examples below:
R:
X<-matrix(c(0.2,0.5,0.1,0.7,0.5,0.3), nrow=2, ncol=3)
Y<-matrix(c(0.2,0.3,0.6,0.7), nrow=2)
cor(X,Y)
     [,1] [,2]
[1,]    1    1
[2,]    1    1
[3,]   -1   -1
Python:
import numpy as np
X = np.array([[0.2, 0.5], [0.1, 0.7], [0.5, 0.3]], ndmin=2).T
Y = np.array([[0.2, 0.3], [0.6, 0.7]], ndmin=2).T
corr = np.corrcoef(X, Y, rowvar=False)
array([[ 1.,  1., -1.,  1.,  1.],
       [ 1.,  1., -1.,  1.,  1.],
       [-1., -1.,  1., -1., -1.],
       [ 1.,  1., -1.,  1.,  1.],
       [ 1.,  1., -1.,  1.,  1.]])
How can I get Python to return a 3x2 matrix like R does? Or how should I select the correct values from Python's 5x5 matrix so that it matches R's result?
CodePudding user response:
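In R, when x and y are matrices, cor(x, y) returns the correlation of each column of x (3 columns here) with each column of y (2 columns here). In Python, np.corrcoef(X, Y, rowvar=False) stacks the columns of X and Y into a single set of 5 variables, so the 5x5 result holds the X-with-X correlations in its top-left 3x3 block and the X-with-Y correlations in its top-right 3x2 block. To match R, slice rows 0:3 (the columns of X) and columns 3:5 (the columns of Y). Note that rowvar=False is required here: without it, NumPy treats rows as variables and cannot stack a 2x3 array with a 2x2 array.
np.corrcoef(X, Y, rowvar=False)[0:3, 3:5]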
array([[ 1.,  1.],
       [ 1.,  1.],
       [-1., -1.]])
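For matrices of other sizes, the same selection can be wrapped in a small helper. The following is only a sketch: the name cross_corr is hypothetical (it is not a NumPy function), and it assumes that columns are variables and rows are observations, as in R's cor. It takes the rows that belong to X's columns and the columns that belong to Y's columns from the combined matrix.

```python
import numpy as np

def cross_corr(X, Y):
    """Correlate each column of X with each column of Y, similar to R's cor(X, Y).

    Hypothetical helper for illustration; assumes columns are variables.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    p = X.shape[1]  # number of columns (variables) in X
    # rowvar=False treats columns as variables; the result is (p + q) x (p + q)
    full = np.corrcoef(X, Y, rowvar=False)
    # Rows 0..p-1 correspond to X's columns, columns p.. to Y's columns
    return full[:p, p:]

X = np.array([[0.2, 0.5], [0.1, 0.7], [0.5, 0.3]]).T  # shape (2, 3)
Y = np.array([[0.2, 0.3], [0.6, 0.7]]).T              # shape (2, 2)
result = cross_corr(X, Y)
print(result.shape)  # (3, 2)
```

With only two observations every correlation is +1 or -1, so different blocks of the 5x5 matrix can look identical; trying the helper on data with more rows makes it easier to check that the right block was selected.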