I have a graph problem where we have:
- There are exactly n vertices and n-1 edges.
- All vertices are connected with each other – i.e. the network consists of just one connected component. The network is thus a tree.
- All edges have positive length (strictly greater than 0). All edges can carry traffic in both directions.
- I am given the shortest path distance between each pair of vertices.
More formally: Let the actual vertice network be a tree T. Given just the shortest path distances of T, you have to reconstruct the original network T.
Input: An n × n distance matrix H with Hi,j = δT (i,j), where T is the actual network of vertices and δT is the shortest path distance between vertice i and j in T.
Output: The n −1 edges of T.
Example:
•T is the actual vertice network.
•H is the n × n shortest path distance matrix.
•G(H) is the complete graph on n nodes, where edge (i,j) has weight Hi,j – i.e. the shortest path distance in T.
My question about Time Complexity:
What is the running time of the algorithm resulting from running the Prim algorithm on the input and returning the list of edges as a function of n? (Note that |E(G(H))|= Θ(n2)). Should Amortized analysis be used here? Im not really sure.
CodePudding user response:
The time complexity of Prim's algorithm using the adjacency matrix of a complete graph is Theta(n^2)
. We can see this from the pseudocode of Prim's algorithm with our adjacency matrix H
:
- Initialize a set
Q
of vertices not in the tree, initially all vertices. Choose the first vertex to be our rootR
. - Initialize two arrays of length
n
,key
andparent
.key
will store, at positioni
, the minimum weight edge connecting thei
th vertex to the current MST; initially, this is infinity.parent
will store, after the algorithm is done, the parent of each vertex in the MST rooted atR
. Initially,parent[i] = R
for alli
, except the root, which has no parent. - Loop over the first row of
H
(corresponding to the rootR
) and assignkey[i] = H[0][i]
. RemoveR
from our setQ
. - While
Q
is not empty:- Loop over
Q
, and extract any vertexu
with minimumkey[u]
- For each vertex
v
from 0 to n-1:- If
v
inQ
andH[u][v] < key[v]
:- Set
key[v]
=H[u][v]
- Set
parent[v]
=u
- Set
- If
- Loop over
Here, the loop in (4) runs n-1
times, and inside the loop, we do Theta(n)
work. In total, that gives a runtime of Theta(n^2)
, which is optimal for any algorithm that needs to read the entire adjacency matrix. In particular, for a generic complete graph, this is optimal, but this doesn't imply that Prim's algorithm is optimal for your specific case with a narrower class of graphs formed from distance matrices.
To show that your problem transformation is correct, we need to verify that, given a tree T
with positive weights, the complete graph G(H)
formed by taking the distance matrix of T
as an adjacency matrix will satisfy:
G(H)
has a unique minimum spanning treeT
is a minimum spanning tree ofG(H)
.
This requires proving several properties of minimum spanning trees in general. One theorem about minimum spanning trees, proven as Corollary 3.5 in these MIT lecture notes, says that:
Let
G = (V,E,w)
be a connected, weighted, undirected graph. LetT
be any MST and let(U, V \ U)
be any cut. ThenT
contains a light edge for the cut.
Here, a 'light edge for a cut (U, V \ U)' means an edge whose weight is the minimum weight of all edges with exactly one endpoint in U
.
Now, we just need to choose appropriate cuts to prove what we want. For an arbitrary edge e
in your original tree T
, consider the two trees T1
and T2
we get by deleting e
.
Take the vertices of T1
, which we'll call V(T1)
, as our cut. We need to show that in the complete graph G(H)
, the edge e
is the unique light edge for that cut. In our original tree T
, e
is the only edge that crosses the cut. This means that any path with one endpoint u
in V(T1)
and the other endpoint v
in V(T2)
must include e
.
Since all the weights are positive, this means that the distance in T, distance(u,v) > weight(e)
, for any u, v such that (u in V(T1), v in V(T2), and (u, v) != e)
. Since the distance in T
between u
and v
is the weight of the edge (u, v)
in our complete graph G(H)
, this means that e
is the unique minimum weight edge that crosses the cut. Since e
was an arbitrary edge in T
, this now means that all edges of T
must be in our MST for G(H)
, so the unique MST of G(H)
is T
.