I have tested the timing of the built-in np.gradient function against a hard-coded function that computes the gradient using the same central difference scheme. Using the cProfile module, I found that np.gradient takes significantly less time. This suggests that my implementation of the gradient computation in Python is extremely naive. I would like to understand how np.gradient is implemented, so that I can learn how to properly implement math operations in Python. (I am aware that the hard-coded function does not compute exactly the same thing as np.gradient, since the boundary values are not included, but this should not affect the results.)
Here is the Python code I used for the test:
import numpy as np
import cProfile

# Create ndarray
N = 1000
T = np.empty((N, N))

# Function to test the built-in gradient function of numpy
def test_grad_np(T, n):
    for s in range(n):
        dTdx, dTdy = np.gradient(T)

# Hard-coded gradient function (interior points only)
def hc_grad(T):
    Ny, Nx = np.shape(T)
    dTdx = np.zeros((Ny, Nx))
    dTdy = np.zeros((Ny, Nx))
    for j in range(1, Ny-1):
        for i in range(1, Nx-1):
            dTdx[j,i] = (T[j,i+1] - T[j,i-1])/2.
            dTdy[j,i] = (T[j+1,i] - T[j-1,i])/2.
    return dTdx, dTdy

# Function to test the hard-coded gradient function
def test_hc_grad(T, n):
    for s in range(n):
        dTdx, dTdy = hc_grad(T)

cProfile.run('test_hc_grad(T, 20)')
cProfile.run('test_grad_np(T, 20)')
and this is the output on my machine:
144 function calls in 16.818 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
20 0.000 0.000 0.000 0.000 <__array_function__ internals>:2(shape)
1 0.001 0.001 16.818 16.818 <string>:1(<module>)
20 0.000 0.000 0.000 0.000 fromnumeric.py:1922(_shape_dispatcher)
20 0.000 0.000 0.000 0.000 fromnumeric.py:1926(shape)
20 16.803 0.840 16.804 0.840 test_speed.py:20(hc_grad)
1 0.013 0.013 16.817 16.817 test_speed.py:30(test_hc_grad)
1 0.000 0.000 16.818 16.818 {built-in method builtins.exec}
20 0.000 0.000 0.000 0.000 {built-in method numpy.core._multiarray_umath.implement_array_function}
40 0.001 0.000 0.001 0.000 {built-in method numpy.zeros}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
704 function calls (624 primitive calls) in 0.262 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
40 0.000 0.000 0.001 0.000 <__array_function__ internals>:2(empty_like)
20 0.000 0.000 0.248 0.012 <__array_function__ internals>:2(gradient)
40 0.000 0.000 0.001 0.000 <__array_function__ internals>:2(ndim)
1 0.001 0.001 0.262 0.262 <string>:1(<module>)
20 0.000 0.000 0.000 0.000 _asarray.py:110(asanyarray)
40 0.000 0.000 0.000 0.000 _asarray.py:23(asarray)
40 0.000 0.000 0.000 0.000 fromnumeric.py:3102(_ndim_dispatcher)
40 0.000 0.000 0.001 0.000 fromnumeric.py:3106(ndim)
40 0.000 0.000 0.000 0.000 function_base.py:798(_gradient_dispatcher)
20 0.245 0.012 0.247 0.012 function_base.py:803(gradient)
40 0.000 0.000 0.000 0.000 multiarray.py:75(empty_like)
40 0.000 0.000 0.000 0.000 numerictypes.py:285(issubclass_)
20 0.000 0.000 0.000 0.000 numerictypes.py:359(issubdtype)
1 0.014 0.014 0.261 0.261 test_speed.py:15(test_grad_np)
1 0.000 0.000 0.262 0.262 {built-in method builtins.exec}
60 0.000 0.000 0.000 0.000 {built-in method builtins.issubclass}
40 0.000 0.000 0.000 0.000 {built-in method builtins.len}
60 0.000 0.000 0.000 0.000 {built-in method numpy.array}
100/20 0.001 0.000 0.248 0.012 {built-in method numpy.core._multiarray_umath.implement_array_function}
40 0.000 0.000 0.000 0.000 {method 'append' of 'list' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
CodePudding user response:
If you're really curious, why not have a look at the np.gradient source code?
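You don't even need to leave Python to do that. As a minimal sketch, the standard inspect module can print the implementation directly (inspect follows the __wrapped__ attribute that numpy's dispatch machinery sets, so this should show the real function body, which lives in numpy/lib/function_base.py, the same file that appears in your profile output):

import inspect
import numpy as np

# Print the source of the actual np.gradient implementation.
print(inspect.getsource(np.gradient))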
I'm not an expert on np.gradient, but just from playing around with your current implementation, a lot can be improved by using vectorized code. I believe the following should be equivalent:
def hc_grad(T):
    dTdx = np.zeros_like(T)
    dTdy = np.zeros_like(T)
    # One sliced operation per axis replaces the double Python loop:
    # T[1:-1, 2:] is T[j, i+1] and T[1:-1, :-2] is T[j, i-1].
    dTdx[1:-1, 1:-1] = (T[1:-1, 2:] - T[1:-1, :-2]) / 2
    dTdy[1:-1, 1:-1] = (T[2:, 1:-1] - T[:-2, 1:-1]) / 2
    return dTdx, dTdy
which already runs a lot faster on my computer.
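To quantify that on your own machine, here is a minimal sketch of a timing comparison using the standard timeit module (it assumes the vectorized hc_grad above is defined, and mirrors the 20 repetitions of your cProfile runs; random data is used instead of np.empty so the values are defined):

import timeit
import numpy as np

N = 1000
T = np.random.rand(N, N)

# Time 20 calls of each version, as in the cProfile test above.
t_vec = timeit.timeit(lambda: hc_grad(T), number=20)
t_np = timeit.timeit(lambda: np.gradient(T), number=20)
print(f"vectorized hc_grad: {t_vec:.3f} s, np.gradient: {t_np:.3f} s")

The remaining gap, if any, should be much smaller than the ~60x difference in your profile, because both versions now do the arithmetic in compiled NumPy code instead of a Python-level double loop.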