Home > Software engineering >  How to build a Minimum Spanning Tree given a list of 200 000 nodes?
How to build a Minimum Spanning Tree given a list of 200 000 nodes?

Time:11-23

Problem

I have a list of approximatly 200000 nodes that represent lat/lon position in a city and I have to compute the Minimum Spanning Tree. I know that I need to use Prim algorithm but first of all I need a connected graph. (We can assume that those nodes are in a Euclidian plan)

To build this connected graph I thought firstly to compute the complete graph but (205000*(205000-1)/2 is around 19 billions edges and I can't handle that.

Options

Then I came across to Delaunay triangulation: with the fact that if I build this "Delauney graph", it contains a sub graph that is the Minimum Spanning Tree according and I have a total of around 600000 edges according to Wikipedia [..]it has at most 3n-6 edges. So it may be a good starting point for a Minimum Spanning Tree algorithm.

Another options is to somewhat build an approximate connected graph but with that I will maybe miss important edges that will influence my Minimum Spanning Tree.

My question

Is Delaunay a reliable solution in this case? If so, is there any other reliable solution than delaunay triangulation to this problem ?

Further information: this problem has to be solved in C.

CodePudding user response:

The Delaunay triangulation of a point set is always a superset of the EMST of these points. So it is absolutely "reliable"*. And recommended, as it has a size linear in the number of points and can be efficiently built.

*When there are cocircular point quadruples, neither the triangulation nor the EMST are uniquely defined, but this is usually harmless.

CodePudding user response:

This is more "thoughts" than "answer" but it doesn't fit in a comment so oh well.

There's a big question here of what libraries you have access to and how much you trust yourself as a coder. (I'm assuming the fact that you're new on SO should not be taken as a measure of your overall experience as a programmer - if it is, well, RIP.)

If we assume you don't have access to Delaunay and can't implement it yourself, minimum spanning trees algorithms that pre-suppose a graph aren't necessarily off limits to you. You can have the complete graph conceptually but not actually. Kruskal's algorithm, for instance, assumes you have a sorted list of all edges in your graph; most of your edges will not be near the minimum, and you do not have to compare all n^2 to find the minimum.

You can find minimum edges quickly by estimations that give you a reduced set, then refinement. For instance, if you divide your graph into a 100/100 grid, for any point p in the graph, points in the same grid square as p are guaranteed to be closer than points three or more squares away. This gives a much smaller set of points you have to compare to safely know you've found the closest.

It still won't be easy, but that might be easier than Delaunay.

  • Related