Topological sort complexity in linear time?-CodePudding

I am trying to calculate the algorithmic complexity of Kahn's algorithm implemented in python, I came across this article :https://www.geeksforgeeks.org/topological-sorting-indegree-based-solution/ in the part of the code for calculating the degree of all nodes has this

for each node in Nodes
    If (list[node].size()!=0) then
        for each dest in list
            indegree[dest]  ;

in the article it says that the complexity is O(V E), but why? there are two nested fors, shouldn't it be O(V*E) ?

I wrote an implementation in python something like this:

for vertex in graph:                    # O(V) * O(E) = O(V * E). ??
    for edge in graph[vertex]:          # O(E) O(1) => O(E)
        indegree[dege]  = 1             # O(1)

V = # of vertex of the graph

E = # of edges

CodePudding user response：

This is notation often used in graph algorithms to specify how dependent the algorithm is on the density of the graph. E is always O(V^2), but that bound may not be tight. Say there aren't many edges (for example, a tree, where there are exactly V - 1 edges). Then O(V E) = O(V V) = O(2V) = O(V).

Now consider a dense graph (for example, a complete graph, where there are at least V(V-1) edges, or more if self edges are allowed). Now O(V E) = O(V V^2 - V) = O(V^2).

Inside O(...), behaves somewhat like max: the overall complexity is determined by the larger of the two addends. This basically gives you a kind of shorthand for showing how the complexity grows along two axes: one, the number of vertices, and two, the density of the edges.

CodePudding user response：

Specifically, the complexity would be more accurately stated as V 2E (which has overall complexity class O(V E). We iterate over each vertex exactly once, and each edge exactly twice (once for each vertex it's attached to) - assuming that graph[vertex] returns only the edges attached to that vertex, which seems to be the algorithm's intention.

Nested loops does usually indicate a * instead of a , where complexity is concerned. If the inner loop is iterating over something different than the outer loop, however, we sometimes get cases like this - where there's a specific and constant number of things across all inner loops, distributed somehow across each element of the outer loops.

In such a case, we can indeed write the complexity as O(V E), because the number of E we iterate over isn't tied in any way to the number of V we iterate over - just the order in which we do so, which complexity doesn't really care about.