Cases where numpy `mean` is more computationally efficient vs executing the math via python code


This is a follow-up to this SO answer:

https://stackoverflow.com/a/71185257/3259896

Moreover, note that mean is not very optimized for such a case. It is faster to use (a[b[:,0]] + a[b[:,1]]) * 0.5, although the intent is less clear.
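For concreteness, here is a minimal sketch of the two equivalent computations being compared; the arrays a and b are made up for illustration, since the original question's data isn't shown here:

    import numpy as np

    # Hypothetical data: a holds 5 rows, b holds pairs of row indices into a.
    a = np.arange(15.0).reshape(5, 3)        # shape (5, 3)
    b = np.array([[0, 1], [1, 2], [3, 4]])   # shape (3, 2)

    # The clear version: average each indexed pair of rows with mean.
    midpoints_mean = a[b].mean(axis=1)

    # The faster but less readable version from the quoted answer.
    midpoints_fast = (a[b[:, 0]] + a[b[:, 1]]) * 0.5

    assert np.allclose(midpoints_mean, midpoints_fast)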

This is further elaborated in the comments

mean is optimized for 2 cases: the computation of the mean of contiguous lines along the last contiguous axis OR the computation of the mean of many long contiguous lines along a non-contiguous axis.

I looked up contiguous arrays and found them explained here:

What is the difference between contiguous and non-contiguous arrays?

It means the array is stored in one unbroken block of memory.
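To see this concretely (my own illustration, not from the thread), you can check an array's memory layout via its flags:

    import numpy as np

    x = np.zeros((4, 6))
    print(x.flags['C_CONTIGUOUS'])   # True: the rows sit back-to-back in memory

    y = x.T                          # a transpose only changes the strides
    print(y.flags['C_CONTIGUOUS'])   # False: walking a row of y jumps around

    # mean along the last axis of a C-contiguous array streams through
    # memory in order -- the first case the quoted comment describes.
    x.mean(axis=1)
    # mean down axis 0 reduces across many rows, each of which is a long
    # contiguous line -- the second case the comment describes.
    x.mean(axis=0)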

However, it is still not clear to me whether there are any solid cases where I should use mean over just writing out the arithmetic myself.

I would love to have some solid examples of where and when to use each type of operation.

CodePudding user response:

While I've worked with numpy for a long time, I still have to do timings; I can predict some comparisons, but not all. In addition, there's the matter of scaling: your previous example was relatively small.

With a of shape (5,3) and b of shape (3,2):
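(The session doesn't show how a and b were built; something along these lines, assumed purely for illustration, would reproduce the shapes involved:)

    import numpy as np

    # Hypothetical setup matching the shapes above; the answer does not
    # show how a and b were actually constructed.
    rng = np.random.default_rng(0)
    a = rng.random((5, 3))               # 5 data points
    b = rng.integers(0, 5, size=(3, 2))  # 3 pairs of row indices into a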

In [145]: timeit np.add(a[b[:,0]],a[b[:,1]])/2
17.8 µs ± 24.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [146]: timeit (a[b[:,0]] + a[b[:,1]])/2
17.9 µs ± 302 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [147]: timeit (a[b[:,0]] + a[b[:,1]])/2
17.8 µs ± 18.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [148]: timeit np.add(a[b[:,0]],a[b[:,1]])/2
18 µs ± 6.43 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [149]: timeit np.add.reduce(a[b],1)/2
19.3 µs ± 1.04 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [150]: timeit np.sum(a[b],1)/2
25.1 µs ± 309 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [151]: timeit np.mean(a[b],1)
35.9 µs ± 853 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [152]: timeit a[b].mean(1)
29.4 µs ± 658 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [153]: timeit a[b].sum(1)/2
20.9 µs ± 885 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

While a[b[:,0]] + a[b[:,1]] is fastest, you probably don't want to expand that approach when b is (n, 5), as the sketch below suggests.
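Here is a rough illustration, with made-up shapes, of why the explicit-sum form stops being attractive as b widens:

    import numpy as np

    rng = np.random.default_rng(0)
    a = rng.random((100, 3))
    b = rng.integers(0, 100, size=(50, 5))   # five indices per group now

    # Spelling the sum out by hand no longer scales well:
    manual = (a[b[:, 0]] + a[b[:, 1]] + a[b[:, 2]]
              + a[b[:, 3]] + a[b[:, 4]]) / 5

    # The general forms stay one line no matter how wide b is:
    reduced = np.add.reduce(a[b], axis=1) / b.shape[1]
    averaged = a[b].mean(axis=1)

    assert np.allclose(manual, reduced) and np.allclose(manual, averaged)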

Note that all these alternatives make full use of numpy array methods.

What you want to watch out for is using list-like iteration on an array, or performing array operations on lists, especially small ones. Making an array from a list takes time, and iterating over the elements of an array is slower than iterating over the elements of a list.
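As a rough sketch of both anti-patterns (my own example, not from the thread):

    import numpy as np

    data = list(range(10_000))
    arr = np.array(data)

    # Anti-pattern 1: a Python-level loop over an ndarray. Each access
    # produces a numpy scalar object, so this is slower than looping
    # over the plain list would be.
    total = 0
    for v in arr:
        total += v

    # Anti-pattern 2: calling a numpy function on a small list; the
    # list-to-array conversion can cost more than the math saves.
    np.mean([1.0, 2.0, 3.0])

    # Idiomatic: keep the data in an array and use array methods throughout.
    arr.sum()
    arr.mean()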
