I've found myself using NumPy arrays more and more lately for memory management and speed of computation on large volumes of structured data (such as points and polygons). In doing so, there is always a situation where I need to apply some function f(x) to an entire array. From experience, and from Googling, iterating over the array is not the way to do this; instead the function should be vectorized and broadcast over the entire array.
Looking at the documentation for numpy.vectorize, we get this example:
def myfunc(a, b):
    "Return a-b if a>b, otherwise return a+b"
    if a > b:
        return a - b
    else:
        return a + b
>>> vfunc = np.vectorize(myfunc)
>>> vfunc([1, 2, 3, 4], 2)
array([3, 4, 1, 2])
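For context, the reason np.vectorize is needed here at all is that the plain Python myfunc fails outright on an array input (a quick sketch of the failure):

```python
import numpy as np

def myfunc(a, b):
    "Return a-b if a>b, otherwise return a+b"
    if a > b:
        return a - b
    else:
        return a + b

# On an array, `if a > b` produces a boolean array, which has no
# single truth value, so Python can't branch on it:
try:
    myfunc(np.array([1, 2, 3, 4]), 2)
except ValueError as e:
    print(e)  # "The truth value of an array ... is ambiguous ..."
```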
And per the docs, it really just wraps a for loop, so it doesn't reach the lower-level C loops used for truly vectorized operations (whether via BLAS or SIMD). So that got me wondering: if the above is "vectorized", what is this?:
def myfunc_2(a, b):
    cond = a > b
    a[cond] -= b
    a[~cond] += b
    return a
>>> myfunc_2(np.array([1, 2, 3, 4]), 2)
array([3, 4, 1, 2])
Or even this:
>>> a = np.array([1, 2, 3, 4])
>>> b = 2
>>> np.where(a > b, a - b, a + b)
array([3, 4, 1, 2])
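For that matter, the same function can even be written branch-free as pure array arithmetic, using the boolean comparison as a 0/1 mask (a sketch, equivalent to the np.where form above):

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = 2
# (a <= b) is a boolean array that acts as 0/1 in arithmetic:
# where a > b this leaves a - b, where a <= b it adds back 2*b,
# giving a - b + 2*b == a + b.
result = a - b + 2 * b * (a <= b)
print(result)  # [3 4 1 2]
```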
So I ran some tests on these, what I believe to be comparable examples:
>>> arr = np.random.randint(200, size=(1000000,))
>>> setup = 'from __main__ import vfunc, arr'
>>> timeit('vfunc(arr, 50)', setup=setup, number=1)
0.60175449999997
>>> arr = np.random.randint(200, size=(1000000,))
>>> setup = 'from __main__ import myfunc_2, arr'
>>> timeit('myfunc_2(arr, 50)', setup=setup, number=1)
0.07464979999997468
>>> arr = np.random.randint(200, size=(1000000,))
>>> setup = 'from __main__ import myfunc_3, arr'
>>> timeit('myfunc_3(arr, 50)', setup=setup, number=1)
0.0222587000000658
And with larger run windows:
>>> arr = np.random.randint(200, size=(1000000,))
>>> setup = 'from __main__ import vfunc, arr'
>>> timeit('vfunc(arr, 50)', setup=setup, number=1000)
621.5853878000003
>>> arr = np.random.randint(200, size=(1000000,))
>>> setup = 'from __main__ import myfunc_2, arr'
>>> timeit('myfunc_2(arr, 50)', setup=setup, number=1000)
98.19819199999984
>>> arr = np.random.randint(200, size=(1000000,))
>>> setup = 'from __main__ import myfunc_3, arr'
>>> timeit('myfunc_3(arr, 50)', setup=setup, number=1000)
26.128515100000186
Clearly the other options are major improvements over numpy.vectorize. This leads me to wonder why anybody would use numpy.vectorize at all, when you can write what appear to be "purely vectorized" functions or use batteries-included functions like numpy.where.
Now for the questions:
- What are the requirements to say a function is "vectorized", if not conversion via numpy.vectorize? Just being broadcastable in its entirety?
- How does NumPy determine if a function is "vectorized"/broadcastable?
- Why isn't this form of vectorization documented anywhere? (i.e., why doesn't NumPy have a "How to write a vectorized function" page?)
CodePudding user response:
Vectorize Documentation
The documentation provides a great example in mypolyval: there's no good way to write that as a where condition or using simple masking logic.
def mypolyval(p, x):
    _p = list(p)
    res = _p.pop(0)
    while _p:
        res = res*x + _p.pop(0)
    return res
>>> vpolyval = np.vectorize(mypolyval, excluded=['p'])
>>> vpolyval(p=[1, 2, 3], x=[0, 1])
array([3, 6])
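As an aside, for this specific computation NumPy does happen to ship a truly vectorized routine, np.polyval, which accepts an array of x values for a fixed coefficient sequence; the point stands for functions where no such built-in exists:

```python
import numpy as np

# np.polyval evaluates p[0]*x**(N-1) + ... + p[N-1] elementwise
# over an array of x values, entirely in compiled code.
print(np.polyval([1, 2, 3], [0, 1]))  # [3 6]
```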
That is, np.vectorize is exactly what the reference documentation states: a convenience that lets you write code in the same fashion, even without the performance benefits.
And as for the documentation telling you how to write vectorized code, it does, in the relevant reference. It states exactly what you quoted above:
The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop.
Remember: the documentation is an API reference guide with some additional caveats: it's not a NumPy tutorial.
UFunc Documentation
The appropriate reference documentation and glossary document this clearly:
A universal function (or ufunc for short) is a function that operates on ndarrays in an element-by-element fashion, supporting array broadcasting, type casting, and several other standard features. That is, a ufunc is a “vectorized” wrapper for a function that takes a fixed number of specific inputs and produces a fixed number of specific outputs. For detailed information on universal functions, see Universal functions (ufunc) basics.
NumPy hands off array processing to C, where looping and computation are much faster than in Python. To exploit this, programmers using NumPy eliminate Python loops in favor of array-to-array operations. "Vectorization" can refer both to this offloading to C and to the practice of structuring NumPy code to take advantage of it.
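The elementwise, broadcasting behavior the glossary describes can be seen with any real ufunc, e.g. np.add (a minimal illustration):

```python
import numpy as np

# np.add is a ufunc: it loops in compiled C code and applies
# broadcasting automatically across mismatched shapes.
a = np.arange(3)[:, None]   # shape (3, 1)
b = np.arange(4)            # shape (4,)
out = np.add(a, b)          # broadcast to shape (3, 4)
print(out)
# [[0 1 2 3]
#  [1 2 3 4]
#  [2 3 4 5]]
```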
Summary
Simply put, np.vectorize is for code legibility, so you can write code in the same style as for actually vectorized ufuncs. It is not for performance, but there are times when you have no good alternative.
CodePudding user response:
"Vectorization" can mean different things depending on context. Use of low-level C code with BLAS or SIMD is just one of them.
In physics 101, a vector represents a point or velocity whose numeric representation can vary with coordinate system. Thus I think of "vectorization", broadly speaking, as performing math operations on the "whole" array, without explicit control over individual elements.
numpy basically adds an ndarray class to Python. It has a large number of methods (and operators and ufuncs) that do indexing and math in compiled code (not necessarily using processor-specific SIMD). The big gain in speed, relative to Python-level iteration, is the use of compiled code optimized for the ndarray data structure. Python-level iteration (interpreted code) over arrays is actually slower than over equivalent lists.

I don't think numpy formally defines "vectorization". There isn't a "vector" class, and I haven't searched the documentation for those terms. Here, and possibly on other forums, it just means writing code that makes optimal use of ndarray methods. Generally that means avoiding Python-level iteration.
np.vectorize is a tool for applying arrays to functions that only accept scalar inputs. It does not compile or otherwise "look inside" that function, but it does accept and apply arguments in a fully broadcasted sense, as in:
In [162]: vfunc(np.arange(3)[:,None],np.arange(4))
Out[162]:
array([[0, 1, 2, 3],
[1, 2, 3, 4],
[2, 1, 4, 5]])
Speed-wise, np.vectorize is slower than the equivalent list comprehension, at least for smaller samples. Recent testing shows that it scales better, so for large inputs it may win out. But the performance is still nothing like your myfunc_2.
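That claim is easy to check directly; a quick sketch comparing np.vectorize against the equivalent list comprehension (timings are machine- and size-dependent):

```python
import numpy as np
from timeit import timeit

def myfunc(a, b):
    return a - b if a > b else a + b

vfunc = np.vectorize(myfunc)
arr = np.random.randint(200, size=10_000)

# Both produce the same elementwise result; only the dispatch differs.
t_listcomp = timeit(lambda: np.array([myfunc(x, 50) for x in arr]), number=10)
t_vectorize = timeit(lambda: vfunc(arr, 50), number=10)
print(f"list comp: {t_listcomp:.4f}s  vectorize: {t_vectorize:.4f}s")
```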
myfunc is not "vectorized", simply because expressions like if a > b do not work with arrays.

np.where(a > b, a - b, a + b) is "vectorized" because all the arguments to where work with arrays, and where itself uses them with full broadcasting powers.
In [163]: a,b = np.arange(3)[:,None], np.arange(4)
In [164]: np.where(a>b, a-b, a+b)
Out[164]:
array([[0, 1, 2, 3],
[1, 2, 3, 4],
[2, 1, 4, 5]])
myfunc_2 is "vectorized", at least in a (keep in mind that it modifies a in place, so the result depends on earlier calls in the session):
In [168]: myfunc_2(a,2)
Out[168]:
array([[4],
[1],
[2]])
It does not work when b is an array; it's trickier to match the a[cond] shape with anything but a scalar:
In [169]: myfunc_2(a,b)
Traceback (most recent call last):
Input In [169] in <cell line: 1>
myfunc_2(a,b)
Input In [159] in myfunc_2
a[cond] -= b
IndexError: boolean index did not match indexed array along dimension 1; dimension is 1 but corresponding boolean dimension is 4
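One way around that IndexError is to broadcast both inputs to a common shape before masking; a sketch (myfunc_2b is a hypothetical name, not from the question) that also avoids mutating the caller's array:

```python
import numpy as np

def myfunc_2b(a, b):
    # Broadcast both inputs to a common shape first, so the boolean
    # mask lines up with both; then work on a copy rather than
    # assigning into the caller's array.
    a, b = np.broadcast_arrays(a, b)
    out = a.copy()
    cond = a > b
    out[cond] = a[cond] - b[cond]
    out[~cond] = a[~cond] + b[~cond]
    return out

a, b = np.arange(3)[:, None], np.arange(4)
print(myfunc_2b(a, b))   # matches np.where(a > b, a - b, a + b)
```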
===
- What are the requirements to say a function is "vectorized" if not converted via numpy.vectorize? Just broadcastable in its entirety?
In your examples, myfunc is not "vectorized" because it only works with scalars. vfunc is fully "vectorized", but not faster. where is also "vectorized" and (probably) faster, though this may be scale dependent. myfunc_2 is only "vectorized" in a.
- How does NumPy determine if a function is "vectorized"/broadcastable?
numpy doesn't determine anything like this. numpy provides an ndarray class with many methods; it's the use of those methods that makes a block of code "vectorized".
- Why isn't this form of vectorization documented anywhere? (i.e., why doesn't NumPy have a "How to write a vectorized function" page?)
Keep in mind the distinction between "vectorization" as a performance strategy, and the basic idea of operating on whole arrays.