How can I speed up this calculation using a GPU or CUDA? The file contains 30,000 points.
points = pd.read_csv('file.dat', sep='\t', usecols=[0, 1])
d = pd.DataFrame(np.zeros((max_id, max_id)))
dis = sch.distance.pdist(points, 'euclidean')
n = 0
for i in range(max_id):
    print(i)
    for j in range(i + 1, max_id):
        d.at[i, j] = dis[n]
        d.at[j, i] = d.at[i, j]
        n += 1
EDIT
I tried:
points = genfromtxt(path, delimiter='\t', usecols=[0, 1])
points = torch.tensor(points)
d = pd.DataFrame(np.zeros((max_id, max_id)))
dis = torch.cdist(points)
but got:
TypeError: cdist() missing 1 required positional argument: 'x2'
Does that mean I need to read points, or the two columns of points, separately?
CodePudding user response:
NumPy doesn't natively support GPUs, but there are NumPy-friendly libraries that do. One option is PyTorch. torch.cdist is one function you can look at; with it you don't need to fill the matrix using for loops. There is also torch.nn.functional.pdist. In that case you don't need a for loop either: once you have the output, you can rearrange it into the shape you need.
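As a rough sketch of the torch.cdist suggestion: the TypeError in the question arises because cdist takes two operands, x1 and x2, so passing the same tensor twice yields all pairwise distances. The random points below are a stand-in for the two columns read from the file, and the device selection is an assumption about the setup rather than something taken from the question.

```python
import numpy as np
import torch

# Stand-in data (assumption): the question reads two columns from a file instead.
rng = np.random.default_rng(0)
points_np = rng.random((200, 2))

# Use the GPU when one is available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
points = torch.tensor(points_np, dtype=torch.float32, device=device)

# cdist needs both x1 and x2; the same tensor twice gives the full
# symmetric matrix of pairwise Euclidean distances.
d = torch.cdist(points, points)

# Move the result back to NumPy if the rest of the pipeline expects arrays.
d_np = d.cpu().numpy()
print(d_np.shape)
```

Keep in mind that a full 30,000 x 30,000 float32 matrix takes about 3.6 GB, so it has to fit in GPU memory alongside any intermediates.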
CodePudding user response:
NumPy doesn't support GPUs, but you can still speed up your program. It looks like you want to write the calculated distance values into the off-diagonal positions. Here is how to do that with a plain NumPy array:
>>> import numpy as np
>>> max_id = 5
>>> d = np.zeros((max_id, max_id))
>>> dis = np.arange(max_id * (max_id - 1) // 2) ** 2  # placeholder values, one per pair
>>> d[np.triu_indices(max_id, k=1)] = dis
>>> d + d.T  # d += d.T is better
array([[ 0.,  0.,  1.,  4.,  9.],
       [ 0.,  0., 16., 25., 36.],
       [ 1., 16.,  0., 49., 64.],
       [ 4., 25., 49.,  0., 81.],
       [ 9., 36., 64., 81.,  0.]])
In my test with max_id set to 500, the vectorized version is more than ten times faster than the loop:
>>> from timeit import timeit
>>> def loop(max_id):
...     d = np.zeros((max_id, max_id))
...     dis = np.arange(max_id * (max_id - 1) // 2) ** 2
...     n = 0
...     for i in range(max_id):
...         for j in range(i + 1, max_id):
...             d[i, j] = d[j, i] = dis[n]
...             n += 1
...
>>> def triu(max_id):
...     d = np.zeros((max_id, max_id))
...     dis = np.arange(max_id * (max_id - 1) // 2) ** 2
...     d[np.triu_indices(max_id, k=1)] = dis
...     d += d.T
...
>>> timeit(lambda: loop(500), number=10)
0.546407600006205
>>> timeit(lambda: triu(500), number=10)
0.0350386000063736
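For the specific case in the question, where dis comes from pdist, SciPy's squareform performs the same condensed-to-square conversion in a single call. This is an alternative to the triu_indices approach shown above, not part of it, and the random points are again only a stand-in for the real data.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Stand-in data (assumption): replace with the two columns read from the file.
rng = np.random.default_rng(0)
points = rng.random((500, 2))

# Condensed distances: one value per pair (i, j) with i < j.
dis = pdist(points, "euclidean")

# Expand to the full symmetric matrix with a zero diagonal.
d = squareform(dis)
print(d.shape)
```

If a DataFrame is needed afterwards, wrapping the finished array once with pd.DataFrame(d) avoids the slow element-by-element writes through d.at.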
If you want to do more complex loop operations, you can learn about numba.
Update:
It seems that the number of off-diagonal positions is larger than the size of dis. In that case, you just need to slice the result of triu_indices:
dis_size = dis.size
i, j = np.triu_indices(max_id, k=1)
d[i[:dis_size], j[:dis_size]] = dis
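To make the slicing concrete, here is a small self-contained sketch with made-up values: dis holds 7 numbers while a 5 x 5 matrix has 10 upper-triangle positions, so only the first 7 pairs (in row-major order) are filled and the rest stay zero.

```python
import numpy as np

max_id = 5
d = np.zeros((max_id, max_id))

# Hypothetical input: 7 distances, fewer than the 10 pairs above the diagonal.
dis = np.arange(1, 8, dtype=float)

# Row-major upper-triangle indices: (0,1), (0,2), (0,3), (0,4), (1,2), ...
i, j = np.triu_indices(max_id, k=1)

# Fill only as many positions as there are values in dis.
d[i[:dis.size], j[:dis.size]] = dis

# Mirror the upper triangle into the lower triangle.
d += d.T
print(d)
```

Whether leaving the remaining positions at zero is correct depends on why dis is shorter than expected, so it is worth checking how max_id relates to the number of points.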