Fastest way to get dot product of a matrix and each point in the given array-CodePudding

I have an array Nx3 of N points, each one has X, Y and Z coordinate. I need to rotate each point, so i have the rotation matrix 3x3. To do this, i need to get dot product of the rotation matrix and each point. The problem is the array of points is quite massive (~1_000_000x3) and therefore it takes noticeable amount of time to compute the rotated points.

The only solution i come up with so far is simple for loop iterating over array of points. Here is the example with a piece of data:

# Given set of points Nx3: np.ndarray
points = np.array([
     [285.679, 219.75,  524.733],
     [285.924, 219.404, 524.812],
     [285.116, 219.217, 524.813],
     [285.839, 219.557, 524.842],
     [285.173, 219.507, 524.606]
        ])
points_t = points.T 
# Array to store results
points_rotated = np.empty((points_t.shape))
# Given rotation matrix 3x3: np.ndarray
rot_matrix = np.array([
        [0.57423549, 0.81862056, -0.01067613],
        [-0.81866133, 0.57405696, -0.01588193],
        [-0.00687256, 0.0178601, 0.99981688]
            ])

for i in range(points.shape[0]):
    points_rotated[:, i] = np.dot(rot_matrix, points_t[:, i])

# Result
points_rotated.T 
# [[ 338.33677093 -116.05910163  526.59831864]
#  [ 338.1933725  -116.45955203  526.6694408 ]
#  [ 337.5762975  -115.90543822  526.67265381]
#  [ 338.26949115 -116.30261156  526.70275207]
#  [ 337.84863885 -115.78233784  526.47047941]]

I dont feel confident using numpy as i quite new to it, so i'm sure there is at least more elegant way to do. I've heard about np.einsum() but i don't understand yet how to implement it here and i'm not sure it will make it quicker. And the main problem is still the computation time, so i want to now how to make it work faster?

Thank you very much in advance!

CodePudding user response：

You can write the code using numba parallelization in no python mode as:

@nb.njit("float64[:, ::1](float64[:, ::1], float64[:, ::1])", parallel=True)
def dot(a, b):  # dot(points, rot_matrix)

    dot_ = np.zeros((a.shape[0], b.shape[0]))
    for i in nb.prange(a.shape[0]):
        for j in range(b.shape[0]):
            dot_[i, j] = a[i, 0] * b[j, 0]   a[i, 1] * b[j, 1]   a[i, 2] * b[j, 2]
    return dot_

It must be better than ko3 answer due to parallelization and signatures and using algebra instead np.dot. I have tested similar code (that was applied just on upper triangle of an array) instead dot product in another answer which showed that was at least 2 times faster (as far as I can remember).

CodePudding user response：

If the number of iterations is large, you could try out numba's njit method:

from numba import njit

points = np.array([
     [285.679, 219.75,  524.733],
     [285.924, 219.404, 524.812],
     [285.116, 219.217, 524.813],
     [285.839, 219.557, 524.842],
     [285.173, 219.507, 524.606]
        ])

rot_matrix = np.array([
        [0.57423549, 0.81862056, -0.01067613],
        [-0.81866133, 0.57405696, -0.01588193],
        [-0.00687256, 0.0178601, 0.99981688]
            ]) 

@njit
def compute(points, rot_matrix):
    points_t = points.T 
    points_rotated = np.empty((points_t.shape))
    for i in range(points.shape[0]):
        points_rotated[:, i] = np.dot(rot_matrix, points_t[:, i])    
    return points_rotated.T 

compute(points, rot_matrix)

It is especially designed for large iterations and should accelerate the operation by a lot. Check out the documentation for more.

CodePudding user response：

I have tried some codes to see the performance of each solution (based on my system) on 1,000,000 x 3 matrix.

Using np.matmul
Using numba njit with dot product(@)
Using numba njit with your code

Here are the results of three functions.

points = np.random.rand(1000000,3)
rot_matrix = np.array([
        [0.57423549, 0.81862056, -0.01067613],
        [-0.81866133, 0.57405696, -0.01588193],
        [-0.00687256, 0.0178601, 0.99981688]


# Using np.matmul
def function_matmul(points, rot_matrix):
    return np.matmul(rot_matrix @ points.T).T

# Using numba njit with dot product(@)
@njit
def function_numba_dot(points, rot_matrix):
    return (rot_matrix, points.T).T

# Using numba njit with your code
@njit
def function_ori_code(points, rot_matrix):
    points_t = points.T 
    points_rotated = np.empty((points_t.shape))
    for i in range(points.shape[0]):
        points_rotated[:, i] = np.dot(rot_matrix, points_t[:, i])    
    return points_rotated.T 


%timeit function_matmul(points, rot_matrix)
16.5 ms ± 665 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit function_numba_dot(points, rot_matrix)
16.3 ms ± 69.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit function_ori_code(points, rot_matrix)
260 ms ± 4.37 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)