Consider the following two implementations of Dijkstra's algorithm:
Using set:
// V - total number of vertices
// S - source node
void dijkstra(int V, vector<vector<int>> edge[], int S)
{
    vector<int> dist(V, 1e9);
    dist[S] = 0;
    set<pair<int, int>> s;
    for (int i = 0; i < V; i++)
        s.insert({dist[i], i}); // log(V) time
    while (!s.empty()) // exactly V times
    {
        auto top = *(s.begin()); // constant time
        int dis = top.first;
        int node = top.second;
        s.erase(top); // amortised constant time
        for (auto it : edge[node]) // over all iterations of the outer while loop this sums to E, the total number of edges in the graph
        {
            int nb = it[0];
            int edge_weight = it[1];
            if (dist[nb] > dis + edge_weight)
            {
                s.erase({dist[nb], nb}); // log(V) time
                dist[nb] = dis + edge_weight;
                s.insert({dist[nb], nb}); // log(V) time
            }
        }
    }
}
Using priority queue:
// V - total number of vertices
// S - source node
vector<int> dijkstra(int V, vector<vector<int>> edge[], int S)
{
    vector<int> dist(V, 1e9);
    dist[S] = 0;
    priority_queue<pair<int, int>, vector<pair<int, int>>, greater<pair<int, int>>> pq;
    pq.push({dist[S], S});
    while (!pq.empty()) // can run more than V times, let's call this heap_size times
    {
        int node = pq.top().second; // O(1) time
        pq.pop(); // log(heap_size) time
        for (int i = 0; i < edge[node].size(); i++) // on average (E/V) iterations, where E is the total number of edges in the graph
        {
            int nb = edge[node][i][0];
            int edge_weight = edge[node][i][1];
            if (dist[nb] > dist[node] + edge_weight) {
                dist[nb] = dist[node] + edge_weight;
                pq.push({dist[nb], nb}); // log(heap_size) time
            }
        }
    }
    return dist;
}
Finding the time complexity of the set-based approach is easy, as the number of elements in the set is at most V
(the number of vertices) and the inner for loop runs once for every edge, so its time complexity is O(V*log(V) + V + E*log(V)),
which is equivalent to O(E*log(V))
(reason: What's the time complexity of Dijkstra's Algorithm)
But I have trouble in understanding the time complexity of the priority_queue approach. Here the same node can be present multiple times in the priority_queue with different distances. How do I calculate the upper bound on the number of nodes that are added to the heap?
I also want to decide which implementation to use based on the nature of the graph(sparse vs dense), are both these implementations equivalent for any type of graph?
CodePudding user response:
Your priority_queue version isn't quite right. Your while loop should start like this:
auto nextPair = pq.top();
pq.pop();
if (dist[nextPair.second] != nextPair.first) {
    continue;
}
int node = nextPair.second;
This ensures that each vertex is processed only once, when its current record is popped from the priority queue.
The complexity analysis then becomes easy, since each edge will then be processed at most once, and there are then at most |E| inserts into the priority queue.
Total complexity is then O(E log E), and since E < V², that's the same as O(E log V).
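Putting the fix together, here is a self-contained sketch of the lazy-deletion version (my own assembly of the fix above; it uses an adjacency list of {neighbour, weight} pairs rather than the question's vector<vector<int>> format):

```cpp
#include <cassert>
#include <queue>
#include <utility>
#include <vector>
using namespace std;

// Dijkstra with a min-heap and lazy deletion: stale (distance, node)
// pairs are skipped when popped instead of being erased eagerly.
vector<int> dijkstra(int V, const vector<vector<pair<int, int>>>& edge, int S)
{
    vector<int> dist(V, 1e9);
    dist[S] = 0;
    priority_queue<pair<int, int>, vector<pair<int, int>>,
                   greater<pair<int, int>>> pq;
    pq.push({0, S});
    while (!pq.empty()) {
        auto [d, node] = pq.top();
        pq.pop();
        if (d != dist[node]) continue; // stale entry: a shorter path was already found
        for (auto [nb, w] : edge[node]) {
            if (dist[nb] > d + w) {
                dist[nb] = d + w;
                pq.push({dist[nb], nb}); // at most one push per edge relaxation
            }
        }
    }
    return dist;
}
```

With the stale-entry check, each vertex is expanded at most once and each edge causes at most one push, which is where the O(E log E) bound comes from.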
The major disadvantage of the priority queue method is that it can consume O(E) space. This is usually OK, since it's on par with the space consumed by the graph itself. Since a priority_queue is a lot faster than a set, the priority_queue version is the way it is commonly done in practice.
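To see the "at most |E| inserts" claim concretely, one can instrument the lazy-deletion version to count pushes (a sketch; countPushes and the sample graph are illustrative, not from the question):

```cpp
#include <cassert>
#include <queue>
#include <utility>
#include <vector>
using namespace std;

// Runs the lazy-deletion Dijkstra and returns how many pairs were ever
// pushed into the priority queue. With the stale-entry check, each edge
// relaxation pushes at most once, so the count is bounded by E + 1
// (the extra 1 is the initial push of the source).
int countPushes(int V, const vector<vector<pair<int, int>>>& edge, int S)
{
    vector<int> dist(V, 1e9);
    dist[S] = 0;
    priority_queue<pair<int, int>, vector<pair<int, int>>,
                   greater<pair<int, int>>> pq;
    pq.push({0, S});
    int pushes = 1;
    while (!pq.empty()) {
        auto [d, node] = pq.top();
        pq.pop();
        if (d != dist[node]) continue; // skip stale entries
        for (auto [nb, w] : edge[node]) {
            if (dist[nb] > d + w) {
                dist[nb] = d + w;
                pq.push({dist[nb], nb});
                pushes++;
            }
        }
    }
    return pushes;
}
```

Without the stale-entry check there is no such per-edge bound, which is exactly why the complexity of the questioner's version is harder to pin down.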