Is it possible to speed up small covariance calculations in NumPy? The function diff_cov_ridge below is called millions of times in my program.
theta is a scalar, and tx, ty, img1, ix1, iy1, x1, y1, img2, ix2, iy2, x2, y2 are length-n vectors.
import numpy as np

def cov(a, b):
    return np.cov(a, b)[0, 1]
def diff_cov_ridge(theta, tx, ty, img1, ix1, iy1, x1, y1, img2, ix2, iy2, x2, y2):
    ct = np.cos(theta)
    st = np.sin(theta)
    eq1 = cov(img1, ix2*x2)
    eq2 = cov(img1, ix2*y2)
    eq3 = cov(img1, iy2*x2)
    eq4 = cov(img1, iy2*y2)
    eq5 = cov(img2, ix1*x1)
    eq6 = cov(img2, ix1*y1)
    eq7 = cov(img2, iy1*x1)
    eq8 = cov(img2, iy1*y1)
    eq9 = cov(ix2, ix1*tx*x1)
    eq10 = cov(ix1, ix2*tx*x2)
    eq11 = cov(ix1*y1, ix2*tx)
    eq12 = cov(ix1, ix2*tx*y2)
    eq13 = cov(ix1*x1, ix2*x2)
    eq14 = cov(ix1*x1, ix2*y2)
    eq15 = cov(ix1*y1, ix2*x2)
    eq16 = cov(ix1*y1, ix2*y2)
    eq17 = cov(ix1, iy2*tx*x2)
    eq18 = cov(ix1, iy2*tx*y2)
    eq19 = cov(ix1*x1, iy2*ty)
    eq20 = cov(ix1*y1, iy2*ty)
    eq21 = cov(ix1*x1, iy2*x2)
    eq22 = cov(ix1*x1, iy2*y2)
    eq23 = cov(ix1*y1, iy2*x2)
    eq24 = cov(ix1*y1, iy2*y2)
    eq25 = cov(ix2, iy1*tx*x1)
    eq26 = cov(ix2, iy1*tx*y1)
    eq27 = cov(iy1, ix2*ty*x2)
    eq28 = cov(iy1, ix2*ty*y2)
    eq29 = cov(ix2*x2, iy1*x1)
    eq30 = cov(ix2*y2, iy1*x1)
    eq31 = cov(ix2*x2, iy1*y1)
    eq32 = cov(ix2*y2, iy1*y1)
    eq33 = cov(iy1*x1, iy2*ty)
    eq34 = cov(iy1, iy2*ty*x2)
    eq35 = cov(iy1*y1, iy2*ty)
    eq36 = cov(iy1, iy2*ty*y2)
    eq37 = cov(iy1*x1, iy2*x2)
    eq38 = cov(iy1*x1, iy2*y2)
    eq39 = cov(iy1*y1, iy2*x2)
    eq40 = cov(iy1*y1, iy2*y2)
CodePudding user response:
The definition of np.cov(a, b)[0, 1]
is simply
np.sum((a - np.mean(a)) * (b - np.mean(b))) / (a.size - 1)
You can therefore avoid the computation of the diagonal elements and the indexing into a 2x2 matrix, which should speed up your computation by a factor of somewhere between 1.5x and 3x. A slightly faster formulation is
np.dot(a - a.mean(), b - b.mean()) / (a.size - 1)
Here is an informal timing test on very small arrays (a.size == 10) that shows the differences:
%timeit np.cov(a, b)[0, 1]
39.3 µs ± 751 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit np.sum((a - np.mean(a)) * (b - np.mean(b))) / (a.size - 1)
23.7 µs ± 370 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit np.dot(a - a.mean(), b - b.mean()) / (a.size - 1)
18 µs ± 83.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
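As a quick sanity check (on made-up random data), all three expressions agree to floating-point precision:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal(10)
b = rng.standard_normal(10)

# Reference: off-diagonal entry of the 2x2 covariance matrix.
c0 = np.cov(a, b)[0, 1]
# Direct formulation, skipping the diagonal entries.
c1 = np.sum((a - np.mean(a)) * (b - np.mean(b))) / (a.size - 1)
# Slightly faster: let np.dot do the multiply-and-sum.
c2 = np.dot(a - a.mean(), b - b.mean()) / (a.size - 1)

assert np.isclose(c0, c1) and np.isclose(c0, c2)
```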
I strongly suspect that using the above formulations, you can pre-compute some of the quantities you need to avoid calling cov so many times.
You can break up the computation of covariance in the same way you do with variance:
((a * b).sum() - a.sum() * b.sum() / a.size) / (a.size - 1)
This gives an additional factor of 2x speedup:
%timeit ((a * b).sum() - a.sum() * b.sum() / a.size) / (a.size - 1)
8.03 µs ± 41.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
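This expanded form rearranges the unbiased estimator as (Σab − Σa·Σb/n)/(n − 1), so no centered copies of the arrays are allocated. Again a quick check on random data that it matches np.cov:

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.standard_normal(10)
b = rng.standard_normal(10)

# Sum-based expansion of the unbiased covariance estimator.
fast = ((a * b).sum() - a.sum() * b.sum() / a.size) / (a.size - 1)
assert np.isclose(fast, np.cov(a, b)[0, 1])
```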
The additional advantage here is that you can pre-compute many of your sums. For example, img1 appears in 4 of your equations, but you only need to compute img1.sum() once for all of them.
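To illustrate the idea (a hypothetical sketch, not a full rewrite of diff_cov_ridge — the helper name and random test data are mine), the four covariances involving img1 can all share one precomputed img1.sum():

```python
import numpy as np

def cov_precomputed(sa, a, b):
    # Sum-based covariance, with sa = a.sum() supplied by the caller
    # so it is computed once and reused across many covariances.
    n = a.size
    return ((a * b).sum() - sa * b.sum() / n) / (n - 1)

rng = np.random.default_rng(2)
n = 10
img1, ix2, iy2, x2, y2 = (rng.standard_normal(n) for _ in range(5))

s_img1 = img1.sum()  # computed once, shared by eq1..eq4
eq1 = cov_precomputed(s_img1, img1, ix2 * x2)
eq2 = cov_precomputed(s_img1, img1, ix2 * y2)
eq3 = cov_precomputed(s_img1, img1, iy2 * x2)
eq4 = cov_precomputed(s_img1, img1, iy2 * y2)

# Each result still matches the original np.cov-based computation.
assert np.isclose(eq1, np.cov(img1, ix2 * x2)[0, 1])
```

The same trick applies to the products like ix1*x1 that recur across equations: form each product array once, and cache its sum alongside it.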