Speeding up triple loop

Time:03-08

Initially I had this loop:

import numpy

datos = numpy.random.rand(1000,17)
clusters = 250
n_variables = 17
centros = numpy.random.rand(clusters,n_variables)
desviaciones = numpy.random.rand(n_variables)
W=1
H=numpy.zeros((len(datos), clusters))
Weight = 1 / n_variables
for k in range(len(datos)):
    inpt = datos[k]
    for i in range(clusters):
        for j in range(n_variables):
            sup = centros[i][j] + W * desviaciones[j]
            inf = centros[i][j] - W * desviaciones[j]
            feat = inpt[j]
            if inf < feat < sup:
                H[k, i] += Weight

but a triple loop slows the process down a lot. I was able to reduce it to:

import numpy

datos = numpy.random.rand(1000,17)
clusters = 250
n_variables = 17
centros = numpy.random.rand(clusters,n_variables)
desviaciones = numpy.random.rand(n_variables)
W=1
H=numpy.zeros((len(datos), clusters))
sup = centros + W*desviaciones
inf = centros - W*desviaciones
Weight = 1 / n_variables
for k in range(len(datos)):
    inpt = datos[k]
    for i in range(clusters):
        suma = (sup[i]>inpt)&(inf[i]<inpt)
        H[k,i]=suma.sum()*Weight

This saves one loop, but I am having trouble replacing the other two loops with numpy functions. The only thing left is to apply the 'suma' formula to every pair of rows from sup/inf and datos. Do you know any way of doing it?

CodePudding user response:

You can reshape centros and datos to three dimensions to take advantage of broadcasting:

centros = centros[None, :, :]    # (   1, 250, 17)  
datos = datos[:, None, :]        # (1000,   1, 17)
desv = W * desviaciones
sup = centros + desv
inf = centros - desv
H = Weight * ((datos < sup) & (datos > inf)).sum(axis=2)
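
As a sanity check, here is a minimal sketch that compares the two-loop version from the question with the broadcast version on a small random input. The function names h_loops and h_broadcast, the reduced array sizes and the fixed seed are assumptions made only for this illustration; they are not part of the original code.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
n_samples, clusters, n_variables = 50, 7, 5  # small hypothetical sizes for a quick check

datos = rng.random((n_samples, n_variables))
centros = rng.random((clusters, n_variables))
desviaciones = rng.random(n_variables)
W = 1
Weight = 1 / n_variables


def h_loops(datos, centros, desviaciones, W, Weight):
    """Two-loop version from the question (hypothetical wrapper name)."""
    sup = centros + W * desviaciones
    inf = centros - W * desviaciones
    H = np.zeros((len(datos), len(centros)))
    for k in range(len(datos)):
        for i in range(len(centros)):
            suma = (sup[i] > datos[k]) & (inf[i] < datos[k])
            H[k, i] = suma.sum() * Weight
    return H


def h_broadcast(datos, centros, desviaciones, W, Weight):
    """Broadcast version from the answer (hypothetical wrapper name)."""
    desv = W * desviaciones
    c = centros[None, :, :]  # (1, clusters, n_variables)
    d = datos[:, None, :]    # (n_samples, 1, n_variables)
    # Boolean array of shape (n_samples, clusters, n_variables)
    inside = (d < c + desv) & (d > c - desv)
    # Count the variables inside the band for each (sample, cluster) pair
    return Weight * inside.sum(axis=2)


H_loop = h_loops(datos, centros, desviaciones, W, Weight)
H_vec = h_broadcast(datos, centros, desviaciones, W, Weight)
same = np.allclose(H_loop, H_vec)
```

Keep in mind that the broadcast comparison builds a temporary boolean array with one entry per (sample, cluster, variable) triple, so for the sizes in the question that is 1000 x 250 x 17 entries. If datos were much larger, processing it in blocks of rows would be one way to keep memory use bounded.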