I've written a method that takes in an integer "n" and creates a square matrix where the values of each element are dictated by their respective i,j indices.
When I build a small matrix 30x30 it works just fine, but when I try to do something larger like 1000x1000 it takes very long. Is there any way that I can speed it up with multiprocessing?
def createMatrix(n):
matrix = []
for j in range(1,n 1):
row = []
for i in range(1,n 1):
value = 1/(i j-1)
row.append(value)
matrix.append(row)
return np.array(matrix)
CodePudding user response:
Parallelizing two computation-bound for
loops in Python is not trivial because of GIL. The good news is that your case is perfectly vectorizeable:
def createMatrix(n):
return 1 / (np.arange(n)[None, :] np.arange(n)[:, None] 1)
Explanation:
- essentially, your formula for the matrix is
X[row][column] = 1/(row column-1)
, where rows and columns are 1-based np.arange(n)
creates a range that can be used for rows or columns[None, :]
and[:, None]
turn it into a 2d array,1 x n
orn x 1
numpy
then broadcasts dimensions, replicating row and column indexes to match dimensions - thus, implicitly tiling both inton x n
when added- since both ranges are 0-based, using
1
instead of-1
As a rule of thumb, it is almost never a good idea to use for loops on numpy arrays. A vectorized approach (i.e. matrix form computations) is orders of magnitude faster.
CodePudding user response:
Your code can be written in another style, accelerated by numba library in a parallel no python mode:
import numba as nb
@nb.njit("float64[:, ::1](int64)", parallel=True, fastmath=True)
def createMatrix(n):
matrix = np.zeros((n, n))
for j in nb.prange(1, n 1):
for i in range(1, n 1):
matrix[j - 1, i - 1] = 1 / (i j - 1)
return matrix
This solution will be faster than the Marat answer above 2 times for n = 1000
(Benchmark)
CodePudding user response:
It's not a good idea to use fors to fill a list then convert it to a matrix. the operation that you have can be vectorized with numpy from scratch. if you think that given the i,j
, M(i,j) = 1/(j i-1)
considering that both indices starts at 1.
Here's my proposal :
def createMatrix2(n):
arr =np.arange(1,n 1)
xx,yy = np.meshgrid(arr,arr)
matrix = 1/(xx yy-1)
return matrix
looking at Marat answer, I think his/her it's better, so tested the 3 methods:
EDIT: added wwii method as createMatrix4 (correcting the errors):
import numpy as np
from time import time
def createMatrix1(n):
matrix = []
for j in range(1,n 1):
row = []
for i in range(1,n 1):
value = 1/(i j-1)
row.append(value)
matrix.append(row)
return np.array(matrix)
def createMatrix2(n):
arr =np.arange(1,n 1)
xx,yy = np.meshgrid(arr,arr)
matrix = 1/(xx yy-1)
return matrix
def createMatrix3(n):
"""Marat's proposed matrix"""
return 1 / (1 np.arange(n)[None, :] np.arange(n)[:, None])
def createMatrix4(n):
""" wwii method"""
i,j = np.ogrid[1:n,1:n]
return 1/(i j-1)
#test all the three methods
n = 10000
t1 = time()
m1 = createMatrix1(n)
t2 = time()
m2 = createMatrix2(n)
t3 = time()
m3 = createMatrix3(n)
t4 = time()
m4 = createMatrix4(n)
t5 = time()
print(np.allclose(m1,m2))
print(np.allclose(m1,m3))
print(np.allclose(m1,m4))
print("Matrix 1 (OP): ",t2-t1)
print("Matrix 2: (mine)",t3-t2)
print("Matrix 3: (Marat)",t4-t3)
print("Matrix 4: (wwii)",t5-t4)
# the output is:
#True
#True
#True
#Matrix 1 (OP): 18.4886577129364
#Matrix 2: (mine) 1.005324363708496
#Matrix 3: (Marat) 0.43033909797668457
#Matrix 4: (wwii) 0.5138359069824219
So Marat's solution is faster. As general comments:
- Try to avoid fors loops
- Think your problem as operation with indices and dessing operations with numpy arrays directly.
For last, given Marat's answer I thought my proposal is a easier to read, and understand. But it's just a subjective view