How can I enable GPU in those lines?

Time:05-05

How can I speed up the calculations for this equation using a GPU or CUDA? The file contains 30,000 points.

points = pd.read_csv('file.dat', sep='\t', usecols=[0, 1])
d = pd.DataFrame(np.zeros((max_id, max_id))) 
dis = sch.distance.pdist(points, 'euclidean') 
n = 0
for i in range(max_id):
    print(i)
    for j in range(i + 1, max_id):
        d.at[i, j] = dis[n]
        d.at[j, i] = d.at[i, j]
        n += 1

EDIT

I tried

    points = genfromtxt(path, delimiter='\t', usecols=[0, 1])
    points =torch.tensor(points)
    d = pd.DataFrame(np.zeros((max_id, max_id))) 
    dis = torch.cdist(points)

but got

TypeError: cdist() missing 1 required positional argument: 'x2'

Does that mean I need to read the points, or the two columns of points, separately?

CodePudding user response:

NumPy doesn't natively support GPUs, but some NumPy-friendly libraries do. One option is PyTorch. Take a look at torch.cdist: it takes two inputs (x1 and x2), so to compare a point set with itself you pass the same tensor twice, and you no longer need for loops to organize the result. There is also torch.nn.functional.pdist, which likewise needs no for loop. Once you have the output, you can reshape it as needed.
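A minimal sketch of the torch.cdist route, assuming PyTorch is installed. The random `points` tensor is a stand-in for the two columns read from the file, and the CPU fallback is an addition for machines without CUDA:

```python
import torch

# Stand-in for the two columns read from the file (random data, illustration only).
points = torch.rand(1000, 2, dtype=torch.float64)

# Use the GPU when one is available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
points = points.to(device)

# cdist needs two point sets; passing the same tensor twice gives the
# full (n, n) Euclidean distance matrix in a single call, with no loops.
d = torch.cdist(points, points)

# Copy the result back to host memory if later code expects a NumPy array.
d_np = d.cpu().numpy()
print(d_np.shape)
```

Keep in mind that a full 30,000 x 30,000 matrix holds 9 x 10^8 entries, which is about 7.2 GB in float64 (3.6 GB in float32), so it may not fit on every GPU.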

CodePudding user response:

Sorry, NumPy doesn't support GPUs, but it is still possible to speed up your program. It seems that you want to write the calculated distance values into the off-diagonal positions. Here is how to do that with a plain NumPy array:

>>> max_id = 5
>>> d = np.zeros((max_id, max_id))
>>> dis = np.arange(max_id * (max_id - 1) // 2) ** 2    # set the value at will
>>> d[np.triu_indices(max_id, k=1)] = dis
>>> d + d.T    # d += d.T is better
array([[ 0.,  0.,  1.,  4.,  9.],
       [ 0.,  0., 16., 25., 36.],
       [ 1., 16.,  0., 49., 64.],
       [ 4., 25., 49.,  0., 81.],
       [ 9., 36., 64., 81.,  0.]])

In my test with max_id = 500, the vectorized version is more than ten times faster than the loop:

>>> from timeit import timeit
>>> def loop(max_id):
...     d = np.zeros((max_id, max_id))
...     dis = np.arange(max_id * (max_id - 1) // 2) ** 2
...     n = 0
...     for i in range(max_id):
...         for j in range(i + 1, max_id):
...             d[i, j] = d[j, i] = dis[n]
...             n += 1
...
>>> def triu(max_id):
...     d = np.zeros((max_id, max_id))
...     dis = np.arange(max_id * (max_id - 1) // 2) ** 2
...     d[np.triu_indices(max_id, k=1)] = dis
...     d += d.T
...
>>> timeit(lambda: loop(500), number=10)
0.546407600006205
>>> timeit(lambda: triu(500), number=10)
0.0350386000063736

If you need more complex loop operations, take a look at Numba.
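For the specific case of expanding pdist output into a full matrix, SciPy's squareform does the same fill in one call. This is a different helper from the triu_indices approach above, shown here as a small sketch assuming SciPy is installed; the random `points` array is stand-in data:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Stand-in data: six random 2-D points (illustration only).
rng = np.random.default_rng(0)
points = rng.random((6, 2))

# Condensed distance vector of length n * (n - 1) / 2.
dis = pdist(points, "euclidean")

# Expand it into the full symmetric matrix with zeros on the diagonal.
d = squareform(dis)

# Cross-check against the triu_indices fill described above.
n = len(points)
check = np.zeros((n, n))
check[np.triu_indices(n, k=1)] = dis
check += check.T
print(np.array_equal(d, check))
```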

Update:

It seems that the upper triangle has more positions than dis has elements. In that case, you just need to slice the result of triu_indices:

dis_size = dis.size
i, j = np.triu_indices(max_id, k=1)
d[i[:dis_size], j[:dis_size]] = dis