NumPy - Covariance between rows of two matrices

Time:09-21

I need to compute the covariance between corresponding rows of two different matrices, i.e. the covariance between the first row of the first matrix and the first row of the second matrix, and so on through the last row of both matrices. I can do it with the loop attached below. My question is: is it possible to avoid the "for loop" and get the same result with NumPy alone?

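import numpy as np

m1 = np.array([[1,2,3],[2,2,2]])
m2 = np.array([[2.56, 2.89, 3.76],[1,2,3.95]])

output = []
for a,b in zip(m1,m2):
    # np.cov(a, b) returns the 2x2 covariance matrix of the row pair.
    cov = np.cov(a, b)
    # The off-diagonal entry is the covariance between a and b.
    output.append(cov[0, 1])
print(output)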

Thanks in advance!

CodePudding user response:

If you are handling big arrays, I would consider this:

from numba import jit
import numpy as np


m1 = np.random.rand(10000, 3)
m2 = np.random.rand(10000, 3)

@jit(nopython=True)
def nb_cov(a, b):
    # Stacking on axis=1 gives shape (n_rows, 2, n_cols), so each x holds one row pair.
    return [np.cov(x)[0,1] for x in np.stack((a, b), axis=1)]

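This gives the following runtime (the first call includes the JIT compilation, which explains the warning about the slowest run):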

>>> %timeit nb_cov(m1, m2)
The slowest run took 94.24 times longer than the fastest. This could mean that an intermediate result is being cached.
1 loop, best of 5: 10.5 ms per loop

Compared with the same list comprehension without Numba:

>>> %timeit [np.cov(x)[0,1] for x in np.stack((m1, m2), axis=1)]
1 loop, best of 5: 410 ms per loop  

CodePudding user response:

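You could use a list comprehension instead of an explicit for loop, and you could eliminate zip (if you wanted to) by adding a new axis to each array and concatenating the two along it, so that each element of the combined array holds one pair of rows. Note that both variants still iterate over the rows in Python; they are just more compact.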

import numpy as np

m1 = np.array([[1,2,3],[2,2,2]])
m2 = np.array([[2.56, 2.89, 3.76],[1,2,3.95]])

# List comprehension on zipped arrays.
out2 = [np.cov(a, b)[0][1] for a, b in zip(m1, m2)]
print(out2)
# [0.5999999999999999, 0.0]

# List comprehension on concatenated arrays.
big_array = np.concatenate((m1[:, np.newaxis, :],
                            m2[:, np.newaxis, :]), axis=1)

out3 = [np.cov(X)[0][1] for X in big_array]
print(out3)
# [0.5999999999999999, 0.0]
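
For reference, the loop can also be avoided entirely by writing out the sample covariance formula, cov(a, b) = sum((a - mean(a)) * (b - mean(b))) / (n - 1), and applying it along each row. The sketch below is only an illustration: rowwise_cov is a hypothetical helper name, not a NumPy function, and it assumes both matrices have the same shape and that the default normalization of np.cov (ddof=1) is wanted.

```python
import numpy as np

def rowwise_cov(a, b, ddof=1):
    """Sample covariance between matching rows of a and b.

    Hypothetical helper, not part of NumPy. Assumes a and b have the
    same 2-D shape; ddof=1 matches the default normalization of np.cov.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Subtract each row's own mean; keepdims lets the result broadcast.
    a_centered = a - a.mean(axis=1, keepdims=True)
    b_centered = b - b.mean(axis=1, keepdims=True)
    # Sum the products of deviations per row and divide by (n - ddof).
    return (a_centered * b_centered).sum(axis=1) / (a.shape[1] - ddof)

m1 = np.array([[1, 2, 3], [2, 2, 2]])
m2 = np.array([[2.56, 2.89, 3.76], [1, 2, 3.95]])

result = rowwise_cov(m1, m2)
print(result)  # one covariance per row pair, roughly 0.6 and 0.0 here
```

Because everything is expressed as whole-array operations, no Python-level loop runs over the rows. The values should agree with the np.cov results above up to floating-point rounding.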