Average distance function in python-CodePudding

I am trying to solve the following question:

consider the following function

Point2D = tuple[int,int]
def average_distance(points: set[Point2D]) -> float:

that is given a set of points on the grid, where each point is specified by a pair of integer coordinates, and returns the average distance between all the pairs of distinct points in points. If there are less items in points than 2, the function raises ValueError.

Example: average_distance({(1,2), (3,4), (5,6)}) is approximately 3.7712

I have spent quite a bit of time thinking how to implement this, but it does not come easy to me. I understand that you have to calculate the distance for all possible combination, and I have found the code below which works, and i partially understand. Can someone shed some light on this problem? at least how to approach in the simpliest way possible?


import math
from itertools import combinations

def dist(p1, p2):
    (x1, y1), (x2, y2) = p1, p2
    return math.sqrt((x2 - x1)**2   (y2 - y1)**2)

x = [1, 3, 5]
y = [2, 4, 6]

points = list(zip(x,y))
distances = [dist(p1, p2) for p1, p2 in combinations(points, 2)]
avg_distance = sum(distances) / len(distances)

CodePudding user response：

The distance between two Cartesian Coordinates can be found by deriving the Pythagorean Theorem: a^2 b^2 = c^2, which can be solved for c by c = sqrt(a^2 b^2). Here, the distance is measured by changes in x and y. That's why the Python formula works well: math.sqrt((x2 - x1)**2 (y2 - y1)**2)

But that is just a single distance between two points. What if you want the average distance between all points? Well, the average is defined as the sum of all parts divided by the number of parts that were summed. In other words, you want to sum each distance, then divide by the total number of distances measured. That is what this is doing:

avg_distance = sum(distances) / len(distances)

Finally, we have distances = [dist(p1, p2) for p1, p2 in combinations(points, 2)].

The [dist(p1, p2) for p1, p2 in combinations(points, 2)] here is what is known as a Python List Comprehension. Think of it as a bit of a for-loop that iterates through all possible combinations of points, assigning their values to p1 and p2, calculating their distance with the dist function, then assigning the result to a new list, the distances variable. To produce each combination of points, you used the combinations method, providing a list of all points and the number of points within each combination (2). It seems as if your primary confusion is with the combinations method. It certainly does a lot of the work for you. I suggest reading through the itertools documentation for more details.

CodePudding user response：

Use itertools.combinations to list up all possible combination pairs. For example,

pairs = itertools.combinations([1, 2, 3], 2)  # [[1, 2], [2, 3], [3, 1])
ds = [math.abs(p, q) for p, q in pairs]. # [1, 1, 2]
ave = sum(ds) / len(pairs) # 1.333..

So is an answer

import math
import itertools

def distance(p, q):
  return math.sqrt((p[0] - q[0])**2   (p[1] - q[1])**2)

def calculate_average_distance(points):
  combinations = itertools.combinations(points, 2)
  distances = [distance(p, q) for (p, q) in combinations]
  num_combinations  len(combinations)
  average_distance = sum(distances) / num_combinations
  return average_distance

ave_d = calculate_average_distance([(1,2), (3,4), (5,6)])  # 3.7712

https://docs.python.org/3/library/itertools.html#itertools.combinations