Home > front end >  Classifying edges in DFS on a directed graph
Classifying edges in DFS on a directed graph

Time:10-08

Based on a DFS traversal, I want to classify edges (u, v) in a directed graph as:

  • Tree edge: when v is visited for the first time as we traverse the edge
  • Back edge: when v is an ancestor of u in the traversal tree
  • Forward edge: when v is a descendant of u in the traversal tree
  • Cross edge: when v is neither an ancestor or descendant of u in the traversal tree

I was following a enter image description here

which is represented with:

self.v = 3
self.graph_list = [[1, 2], [], [1]]

The above code is not identifying the edge (2, 1) as a cross edge, but as a back edge.

I have no clue what to adapt in my code in order to detect cross edges correctly.

In a discussion someone gave this information, but I couldn't make work:

The checking condition is wrong when the node has not been completely visited when the edge is classified. This is because in the initial state the start & end times are set to 0.

if the graph looks like this:

  • 0 --> 1
  • 1 --> 2
  • 2 --> 3
  • 3 --> 1

When checking the 3 --> 1 edge: the answer should be a back edge.

But now the start/end [3] = 4/0 ; start/end [1] = 1/0

and the condition end[3] < end[1] is false because of the intialization problem.

I see two solutions,

  1. traverse the graph first and determine the correct start/end [i] values, but it needs more time complexity, or
  2. use black/gray/white and discover the order to classify the edges

CodePudding user response:

Here are some issues:

  • By initialising start_time and end_time to 0 for each node, you cannot make the difference with a real time of 0, which is assigned to the very first node's start time. You should initialise these lists with a value that indicates there was no start/end at all. You could use the value -1 for this purpose.

  • The following statements should not be inside the loop:

        self.end_time[node] = self.time
        self.time  = 1
    

    They should be executed after the loop has completed. Only at that point you can "end" the visit of the current node. So the indentation of these two statements should be less.

  • There are several places where the value of self.end_time[node] is compared in a condition, but that time has not been set yet (apart from its default value), so this condition makes little sense.

  • The last elif should really be an else because there are no other possibilities to cover. If ever the execution gets there, it means no other possibility remains, so no condition should be checked.

  • The condition self.start_time[node] > self.start_time[neighbour] is not strong enough for identifying a back edge, and as already said, the second part of that if condition makes no sense, since self.end_time[node] has not been given a non-default value yet. And so this if block is entered also when it is not a back edge. What you really want to test here, is that the visit of neighbor has not been closed yet. In other words, you should check that self.start_time[neighbor] is still at its default value (and I propose to use -1 for that).

Not a problem, but there are also these remarks to make:

  • when you keep track of start_time and end_time, there is no need to have visited. Whether a node is visited follows from the value of start_time: if it still has its default value (-1), then the node has not yet been visited.

  • Don't use code comments to state the obvious. For instance the comment "increment the time by 1" really isn't explaining anything that cannot be seen directly from the code.

  • Attribute v could use a better name. Although V is often used to denote the set of nodes of a graph, it is not intuitive to see v as the number of nodes in the graph. I would suggest using num_nodes instead. It makes the code more readable.

Here is a correction of your code:

class Graph:
    def __init__(self, num_nodes):
        self.time = 0
        self.traversal_array = []
        self.num_nodes = num_nodes  # Use more descriptive name
        self.graph_list = [[] for _ in range(num_nodes)]
 
    def dfs(self):
        self.start_time = [-1]*self.num_nodes
        self.end_time = [-1]*self.num_nodes
        for node in range(self.num_nodes):
            if self.start_time[node] == -1:  # No need for self.visited
                self.traverse_dfs(node)

    def traverse_dfs(self, node):
        self.traversal_array.append(node)
        self.start_time[node] = self.time
        self.time  = 1
        for neighbour in self.graph_list[node]:
            # when the neighbor was not yet visited
            if self.start_time[neighbour] == -1:
                print(f"Tree Edge: {node}-->{neighbour}")
                self.traverse_dfs(neighbour)
            # otherwise when the neighbour's visit is still ongoing:
            elif self.end_time[neighbour] == -1:
                print(f"Back Edge: {node}-->{neighbour}")
            # otherwise when the neighbour's visit started before the current node's visit:
            elif self.start_time[node] < self.start_time[neighbour]:
                print(f"Forward Edge: {node}-->{neighbour}")
            else: # No condition here: there are no other options
                print(f"Cross Edge: {node}-->{neighbour}")
        # Indentation corrected:
        self.end_time[node] = self.time
        self.time  = 1
  • Related