How to get the Pearson correlation between matrices

Time:06-14

This is the Python analog of this question asked about R. In summary, I have two NumPy matrices of identical shape and I want to get their Pearson correlation. I just need one number. Feeding the matrices to np.corrcoef just produces another matrix instead of a single value. Flattening each matrix into a one-line array (similar to what was suggested for R) also produces a matrix.

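import numpy as np

a = np.matrix('1 2 3; 3 4 5; 1 2 4')
b = np.matrix('1 2 3; 4 3 5; 3 4 5')

# returns a 2x2 matrix, not a single number
np.corrcoef(a.flatten(),b.flatten())
# returns a 6x6 matrix: each row of a and b is treated as a separate variable
np.corrcoef(np.squeeze(np.asarray(a)), np.squeeze(np.asarray(b)))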

How can I get the correlation between two matrices as a single number?

---EDIT---

A more advanced and comprehensive version of this question would be: how can I get a correlation matrix containing the pairwise correlations of several matrices? For example:

a = np.matrix('1 2 3; 3 4 5; 1 2 4')
b = np.matrix('1 2 3; 4 3 5; 3 4 5')
c = np.matrix('1 2 3; 4 3 5; 3 4 5')

To produce something like:

matrix([[1.        , 0.72280632, 1.        ],
        [0.72280632, 1.        , 1.        ],
        [1.        , 1.        , 1.        ]])

CodePudding user response:

IIUC, you can stack and reshape:

l = [a, b, c]

# only needed if the inputs are np.matrix objects rather than plain arrays
l = list(map(np.asarray, l))

x = np.stack(l).reshape(len(l), -1)

np.corrcoef(x)

Output:

array([[1.        , 0.72280632, 0.72280632],
       [0.72280632, 1.        , 1.        ],
       [0.72280632, 1.        , 1.        ]])
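For the original two-matrix case, the single number asked for is the off-diagonal entry of the 2x2 result. A minimal sketch, assuming the same a and b as in the question (the names a_flat, b_flat and r are only illustrative):

```python
import numpy as np

a = np.matrix('1 2 3; 3 4 5; 1 2 4')
b = np.matrix('1 2 3; 4 3 5; 3 4 5')

# np.matrix.flatten() stays 2-D (shape (1, 9)), so convert to a plain
# ndarray first and then ravel to get genuine 1-D vectors of length 9.
a_flat = np.asarray(a).ravel()
b_flat = np.asarray(b).ravel()

# corrcoef on two 1-D vectors returns a 2x2 matrix; the Pearson r between
# them sits at position [0, 1] (and, by symmetry, at [1, 0]).
r = np.corrcoef(a_flat, b_flat)[0, 1]
print(r)
```

This should give the same value as the a/b entry in the output above (about 0.7228).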

CodePudding user response:

Another option is to use a list comprehension:

np.corrcoef([np.array(i).flatten() for i in [a,b,c]])

array([[1.        , 0.72280632, 0.72280632],
       [0.72280632, 1.        , 1.        ],
       [0.72280632, 1.        , 1.        ]])
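As a sanity check, the flattened correlation can also be computed straight from the Pearson definition and compared with np.corrcoef. This is only a sketch: pearson_flat is a hypothetical helper name, not a NumPy function, and it assumes both inputs have the same number of elements.

```python
import numpy as np

def pearson_flat(m1, m2):
    """Pearson r between two same-sized inputs, each treated as one long vector.

    Hypothetical helper for illustration; not part of NumPy.
    """
    # asarray turns np.matrix into a plain ndarray so ravel() yields 1-D data
    x = np.asarray(m1, dtype=float).ravel()
    y = np.asarray(m2, dtype=float).ravel()
    if x.shape != y.shape:
        raise ValueError("inputs must have the same number of elements")
    # centre both vectors, then divide the cross product by the norms
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

a = np.matrix('1 2 3; 3 4 5; 1 2 4')
b = np.matrix('1 2 3; 4 3 5; 3 4 5')

r_manual = pearson_flat(a, b)
r_numpy = np.corrcoef(np.asarray(a).ravel(), np.asarray(b).ravel())[0, 1]
print(np.isclose(r_manual, r_numpy))
```

Both values should agree with the a/b entry shown in the outputs above.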