I am trying to calculate the algorithmic complexity of Kahn's algorithm implemented in python, I came across this article :https://www.geeksforgeeks.org/topological-sorting-indegree-based-solution/ in the part of the code for calculating the degree of all nodes has this
for each node in Nodes
If (list[node].size()!=0) then
for each dest in list
indegree[dest] ;
in the article it says that the complexity is O(V E), but why? there are two nested fors, shouldn't it be O(V*E) ?
I wrote an implementation in python something like this:
for vertex in graph: # O(V) * O(E) = O(V * E). ??
for edge in graph[vertex]: # O(E) O(1) => O(E)
indegree[dege] = 1 # O(1)
V = # of vertex of the graph
E = # of edges
CodePudding user response:
This is notation often used in graph algorithms to specify how dependent the algorithm is on the density of the graph. E
is always O(V^2)
, but that bound may not be tight. Say there aren't many edges (for example, a tree, where there are exactly V - 1
edges). Then O(V E) = O(V V) = O(2V) = O(V)
.
Now consider a dense graph (for example, a complete graph, where there are at least V(V-1)
edges, or more if self edges are allowed). Now O(V E) = O(V V^2 - V) = O(V^2)
.
Inside O(...)
,
behaves somewhat like max
: the overall complexity is determined by the larger of the two addends. This basically gives you a kind of shorthand for showing how the complexity grows along two axes: one, the number of vertices, and two, the density of the edges.
CodePudding user response:
Specifically, the complexity would be more accurately stated as V 2E
(which has overall complexity class O(V E)
. We iterate over each vertex exactly once, and each edge exactly twice (once for each vertex it's attached to) - assuming that graph[vertex]
returns only the edges attached to that vertex, which seems to be the algorithm's intention.
Nested loops does usually indicate a *
instead of a
, where complexity is concerned. If the inner loop is iterating over something different than the outer loop, however, we sometimes get cases like this - where there's a specific and constant number of things across all inner loops, distributed somehow across each element of the outer loops.
In such a case, we can indeed write the complexity as O(V E)
, because the number of E
we iterate over isn't tied in any way to the number of V
we iterate over - just the order in which we do so, which complexity doesn't really care about.