Why are variable assignments faster than calling from arrays in Python?

Time:09-21

I've been working on optimizing some Euclidean distance transform calculations for a program that I'm building. To preface, I have little formal training in computer science other than some MOOCs I've been taking.

I've learned through empirical testing in Python that assigning values to individual variables and performing operations on them is faster than performing operations on arrays. Is this observation reproducible for others?

If so, could someone provide a deeper explanation as to why there are such speed differences between these two forms of syntax?

Please see some example code below.

import numpy as np
from math import sqrt
import time

# Numpy array math
def test1(coords):
    results = []
    for coord in coords:
        mins = np.array([1,1,1])
        # The three lines below seem faster than np.linalg.norm()
        mins = (coord - mins)**2
        mins = np.sum(mins) 
        results.append(sqrt(mins))
   
# Individual variable assignment math     
def test2(coords):
    results = []
    for point in coords:
        z, y, x = 1, 1, 1
        z = (point[0] - z)**2
        y = (point[1] - y)**2
        x = (point[2] - x)**2
        mins = sqrt(z + y + x)
        results.append(mins)
        
a = np.random.randint(0, 10, (500000,3))

t = time.perf_counter()
test1(a)
print ("Test 1 speed:", time.perf_counter() - t)

t = time.perf_counter()
test2(a)
print ("Test 2 speed:", time.perf_counter() - t)
  • Test 1 speed: 3.261552719 s
  • Test 2 speed: 0.716983475 s

Answer:

Python-level operations and memory allocations are generally much slower than NumPy's highly optimized, vectorized array operations. Because you loop over the array in Python and allocate memory on every iteration, you get none of the benefits that NumPy offers. test1 is especially slow because every iteration allocates several small arrays: np.array([1,1,1]), the difference, and the squared result.
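
To make the per-iteration cost concrete, here is a minimal sketch that times a single step of each approach on one point. The function names numpy_step and scalar_step and the sample point are made up for illustration; the absolute timings will vary by machine, so none are quoted here.

```python
import timeit
from math import sqrt

import numpy as np

# A single sample point (hypothetical values, chosen only for the demo).
point = np.array([3, 4, 5])


def numpy_step(coord):
    # Mirrors one iteration of test1: allocates a fresh 3-element array,
    # then two more temporaries (the difference and its square).
    mins = np.array([1, 1, 1])
    return sqrt(np.sum((coord - mins) ** 2))


def scalar_step(coord):
    # Mirrors one iteration of test2: three element lookups followed by
    # scalar arithmetic. No new array is allocated.
    z = (coord[0] - 1) ** 2
    y = (coord[1] - 1) ** 2
    x = (coord[2] - 1) ** 2
    return sqrt(z + y + x)


n = 20_000
print("array step: ", timeit.timeit(lambda: numpy_step(point), number=n))
print("scalar step:", timeit.timeit(lambda: scalar_step(point), number=n))
```

Both functions compute the same distance, so any difference in the timings comes from how the work is done, not from what is computed.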

Compare your code with a version that offloads all the operations to NumPy instead of having Python perform them one by one:

def test3(coords):
    mins = (coords - 1)**2
    results = np.sqrt(np.sum(mins, axis=1))
    return results

On my system, this results in:

Test 1 speed: 4.995761550962925
Test 2 speed: 1.3881473205983639
Test 3 speed: 0.05562112480401993