Home > Mobile >  Finding the two clusters with smallest distance between a list of clusters
Finding the two clusters with smallest distance between a list of clusters

Time:06-28

I have a list of solution clusters in the following format:

Input:

test = []

# solutions, centroid
test.append([[3,5],4])
test.append([[2,8],5])
test.append([[1,3],2])
test.append([[5,9],7])

Out: [[[3, 5], 4], [[2, 8], 5], [[1, 3], 2], [[5, 9], 7]]

How would I return the union of the 2 clusters with the smallest centroid distance out of all clusters?

Goal Output: [[2,8,3,5],4.5]

The order of the solutions in the union cluster is not relevant.

I have spent a considerable amount of time trying to come up with a solution with multiple loops to no avail.

CodePudding user response:

You can do with two for-loop:

res = []
for t1 in test:
    for t2 in test:
        if t1 != t2:
            res.append([t1[0] t2[0], (t1[1] t2[1])/2])
  
# Or as one-line          
res = [[(t1[0]  t2[0]), (t1[1] t2[1])/2] for t1 in test for t2 in test if t1 != t2]

Output:

>>> sorted(res, key=lambda x: x[1])[0]
[[3, 5, 1, 3], 3.0]

>>> sorted(res, key=lambda x: x[1])
[[[3, 5, 1, 3], 3.0],
 [[1, 3, 3, 5], 3.0],
 [[2, 8, 1, 3], 3.5],
 [[1, 3, 2, 8], 3.5],
 [[3, 5, 2, 8], 4.5],
 [[2, 8, 3, 5], 4.5],
 [[1, 3, 5, 9], 4.5],
 [[5, 9, 1, 3], 4.5],
 [[3, 5, 5, 9], 5.5],
 [[5, 9, 3, 5], 5.5],
 [[2, 8, 5, 9], 6.0],
 [[5, 9, 2, 8], 6.0]]

CodePudding user response:

You can calculate all distances (essentially all combinations 2 by 2) and sort them by smallest distance, them join the lists (the code below assumes test as you defined in your question):

from itertools import combinations

best_pair = sorted([(abs(el1[1] - el2[1]), el1, el2) for el1, el2 in combinations(test, 2)])[0]
result = [best_pair[1][0]   best_pair[2][0], abs(best_pair[1][1]   best_pair[2][1])/2]

There are optimisations that can be made if you have too many clusters, as the algorithm above is quadratic in time and memory.

  • Related