NumPy array computation time questions

Sorry if this question is a little too basic.

I am learning NumPy and learned from "Why are NumPy arrays so fast?" that NumPy arrays are fast for two reasons:

All elements of an np.array() share the same dtype, so the array benefits from locality of reference.
Vectorized operations are possible through things like ufuncs.

Therefore, it stands to reason that the computation speeds should rank as follows (fastest first):

(ufuncs with np arrays) > (loops with np arrays) > (loops with lists).

However, the code below showed that the actual ranking was:

(ufuncs with np arrays) > (loops with lists) > (loops with np arrays).

In other words, although using ufuncs with np arrays is the fastest, loops over lists were faster than loops over np arrays. Could anyone explain why this is so?

Thank you in advance for your answer!

import numpy as np
np.random.seed(0)

arr=np.arange(1,1000)        # NumPy integer array holding 1..999
arrlist=list(range(1,1000))  # plain Python list holding the same values

#case0: list loop
def list_loop(values): #values: 1D array I want to take reciprocal of
  output=[]
  for i in range(len(values)):
    output.append(1.0/values[i])
  return output


#case1: np array loop
def nparray_loop(values): #values: 1D array I want to take reciprocal of
  output=np.empty(len(values))
  for i in range(len(values)):
    output[i]=1.0/values[i]
  return output

#case2: uFunc
def nparray_uFunc(values):
  return 1.0/values

print("list loop computation time:")
%timeit list_loop(arrlist)
print('\n')

print("np array loop computation time:")
%timeit nparray_loop(arr)
print('\n')

print("uFunc computation time:")
%timeit nparray_uFunc(arr)

Output:

list loop computation time:
185 µs ± 5.63 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


np array loop computation time:
4.28 ms ± 402 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


uFunc computation time:
5.42 µs ± 1.23 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)

CodePudding user response：

A list is represented internally as a vector of references to Python objects. A NumPy array is a more complex data structure in which the actual numeric data is stored in raw machine format.

To call a Python function on an element of a Python list, Python just plucks the appropriate element from the list and calls the function. To call a Python function on an element of a NumPy array, each individual raw element must first be converted to the appropriate Python representation, and only then can the function be called.

NumPy is fast for vector operations. If you plan to process the data element by element in a Python loop, it is less efficient than a plain list.
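When a Python-level loop cannot be avoided, one common workaround is to convert the array to a list once before looping. The sketch below is only an illustration (the helper name reciprocal_via_list is made up for this example); it relies on ndarray.tolist(), which converts every element to a built-in Python number in a single pass.

```python
import numpy as np

def reciprocal_via_list(values):
    """Element-wise reciprocal using a Python loop over a converted list."""
    # tolist() converts every raw element to a built-in Python number once,
    # up front, instead of converting one element per values[i] access.
    return [1.0 / v for v in values.tolist()]

arr = np.arange(1, 1000)
looped = reciprocal_via_list(arr)
vectorized = 1.0 / arr  # still the preferred form when it is available

print(type(arr.tolist()[0]))           # a built-in Python int
print(np.allclose(looped, vectorized))
```

The vectorized expression remains the better choice whenever the operation can be written that way; the conversion only helps when the per-element logic really has to run in Python.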

CodePudding user response：

Your thinking about data locality is right, but it's not the full picture.

A Python list, or a NumPy array with dtype=object, holds references: each element points to a separate Python object. A NumPy array can also have a numerical dtype, in which case each element stores the data directly.
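The difference between the two storage models can be inspected directly. This is a small sketch with made-up variable names (items, obj_arr, num_arr), not part of the original question:

```python
import numpy as np

items = [1, 0.5, "two"]                         # arbitrary Python objects
obj_arr = np.array(items, dtype=object)         # stores references to them
num_arr = np.array([1, 2, 3], dtype=np.int64)   # stores raw 8-byte integers

# Each slot of the object array refers to the very same object as the list.
same_objects = all(a is b for a, b in zip(obj_arr, items))

print(obj_arr.dtype, same_objects)
print(num_arr.dtype, num_arr.itemsize, num_arr.nbytes)
```

Here num_arr occupies 3 x 8 = 24 bytes of contiguous numeric data, while obj_arr only holds pointers to objects that live elsewhere in memory.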

Unlike a numerical dtype, a reference does not tell you what type of object it points at. In a loop over a Python list like [1, 0.5, 2], the interpreter has to check the type of values[i] and select the proper implementation of / on each and every iteration (float-int or float-float division). With the vectorized code 1.0/values on a NumPy array with a numerical dtype, such as np.array([1, 4, 2]), NumPy checks the array's dtype (typically numpy.int64) and selects the proper / only once, before iterating. This saves a LOT of time as the iterations add up.
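To make the "check once" point concrete, here is a short sketch. It assumes nothing beyond NumPy itself; the / operator on an array is handled by the np.true_divide ufunc, which works from the array's single dtype.

```python
import numpy as np

values_list = [1, 0.5, 2]
# A list can mix types, so each division must inspect its operand's type.
element_types = [type(v).__name__ for v in values_list]

values_arr = np.array([1, 4, 2])
# The array has one dtype for all elements, so 1.0 / values_arr is handed
# to the true_divide ufunc, which picks its inner loop a single time.
result = 1.0 / values_arr
same_as_ufunc = np.array_equal(result, np.true_divide(1.0, values_arr))

print(element_types)
print(values_arr.dtype.kind, result.dtype, same_as_ufunc)
```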

Now, why would a loop over a NumPy array with a numerical dtype be slower than the equivalent vectorized code? Some languages do type inference during compilation, so they can tell that values is a NumPy array and that values[i] has to be a numpy.int64 value on every iteration. Python does not do this kind of type inference; it checks the type of values[i] on every iteration, unaware that it gets the same answer, numpy.int64, every time. To get around this, NumPy's vectorized code lets Python run the loop in C. That is not to say you could switch to hand-written C and automatically get the same speed; NumPy is heavily optimized.
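The three cases from the question can also be timed outside IPython, where %timeit is not available. The following is a rough sketch using the standard timeit module; the repeat counts are arbitrary choices, and absolute numbers will differ by machine and NumPy version.

```python
import timeit
import numpy as np

arr = np.arange(1, 1000)
arrlist = arr.tolist()

def list_loop(values):
    output = []
    for i in range(len(values)):
        output.append(1.0 / values[i])
    return output

def nparray_loop(values):
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0 / values[i]   # type check and new scalar each time
    return output

def nparray_ufunc(values):
    return 1.0 / values               # one dispatch, then a C loop

cases = [
    ("list loop", list_loop, arrlist),
    ("np array loop", nparray_loop, arr),
    ("ufunc", nparray_ufunc, arr),
]
timings = {}
for name, func, data in cases:
    # Best of 3 repeats of 50 calls each, reported as seconds per call.
    best = min(timeit.repeat(lambda: func(data), number=50, repeat=3)) / 50
    timings[name] = best
    print(f"{name}: {best * 1e6:.1f} µs per call")
```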

Now, why is the loop over the NumPy array slower than the loop over a Python list? That's because values[i] on a NumPy array creates a new numpy.int64 scalar object, whereas values[i] on a Python list just returns an already existing object. In CPython these temporary scalars are freed by reference counting as soon as they are no longer used, but creating and discarding an object on every iteration still takes time that the list loop does not pay.
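The object creation described above can be observed with an identity check. This is a minimal sketch with illustrative variable names:

```python
import numpy as np

lst = [1000, 2000, 3000]
arr = np.array(lst)

# Indexing a list hands back the object that is already stored in it.
a, b = lst[0], lst[0]
list_same_object = a is b

# Indexing a numeric array builds a fresh NumPy scalar on every access.
c, d = arr[0], arr[0]
array_same_object = c is d

print(list_same_object, array_same_object)
print(type(c).__name__)   # a NumPy integer scalar type, not built-in int
```

Two accesses to the same list slot yield one and the same object, while two accesses to the same array slot yield two distinct scalar objects with equal values.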