Home > Enterprise >  Finding cycle in a directed graph in which at least one vertex is a neighbor of a specific vertex (n
Finding cycle in a directed graph in which at least one vertex is a neighbor of a specific vertex (n

Time:01-10

I'm trying to count all possible paths from vertex 0 to vertex 1 in a directed graph. I have already done the algorithm that contains acyclic graphs, but I need to check wheter there is a cycle or not, because if there is, then we might have infinte paths in the graph. (So the paths don't have to be simple paths). Is it optimal to run DFS only for the vertex's 1 neighbors trying to find the cycle and then check if at least one of the vertexes in cycle has an edge leading to the vertex 2?

I'm thinking about something like that :

for every neighbor of vertex 1
    run DFS
if there is a cycle
   for every vertex in cycle
       find if it has an edge leading to vertex 2
else
   run second dfs to count the simple paths

At the moment I have already done the second dfs which basically finds all simple paths when there is no cycle in the graph.

CodePudding user response:

To detect if cycles are present in a graph, modify a depth first search ( DFS ) as follows:

- run DFS 
    - IF a node is reached for the second time 
       - IF path exists from node reached again to current DFS node
          - the path is a cycle

This answer ( https://stackoverflow.com/a/75036736/16582 ) shows the C code to do this.

It seems that you MUST have a direct connection between vertex 1 and ANY of the vertices in the cycle found. If that connection does not exist the cycle can be ignored.

So, run the cycle finding algorithm, to get a vector of cycles. Loop over the cycles, discarding any that do not meet your constraint.

CodePudding user response:

I think you don't need a separate cycle detection. Cycle detection could happen automatically, as you proceed with the main path-counting search. The idea is to do a BFS.

The task can be worded like this:

  1. There's directed graph G, comprised of nodes {Ni} and edges {Ej}.
  2. G could have anything: loops, disconnected groups, orphan nodes, etc.
  3. One node is called "initial" and another node is called "final".
  4. The goal is to calculate the number of unique trajectories from node Ninitial to node Nfinal.

DFS should be avoided for its complexity. Just think about diamond graphs, which could lead to exponential DFS complexity:

       N2      N5      N8
     /    \  /    \  /    \    ....
-- N1      N4      N7      N10 
     \    /  \    /  \    /    ....
       N3      N6      N9

The reasoning for the algorithm is:

  1. Let Pi be the number of trajectories from node Ninitial to node Ni.
  2. Let 'Lj' be the number of trajectories that include edge Ej.
  3. For any edge Ej, its Lj equals to Pi of the starting node of that edge.
  4. For any node 'Ni', its 'Pi' equals to the sum of Lj of all incoming edges.
  5. Let's call a node 'complete' if 'Lj' for all its incoming edges are already known.
  6. Let's assume that initial node is 'complete' with the starting condition Pinitial = 1.

The algorithm itself is:

  1. Set up a queue of node references. BFS will pick nodes from that queue one by one and process them.
  2. Processing some node Ni means propagating its Pi to all its outgoing edges. As the result, some of the immediate neighbour nodes down the graph could become 'complete'.
  3. If, upon processing some node, some other node(s) becomes 'complete', that that other node(s) should be posted to the queue.

That's all.

In an acyclic graph, the above algorithm would iterate all nodes and calculate Pi for all reachable nodes, including Pfinal.

In a graph with cycles, only acyclic part of the graph will be iterated, because any cycle wouldn't allow its edges (and therefore nodes) to become 'complete'. So, if upon completion of the BFS you see that some nodes have been visited, but are still 'incomplete' -- you know you have cycles (or unreachanble incoming edges).

Minimal viable example is:

#include <cstddef>
#include <iostream>
#include <stdexcept>
#include <utility>
#include <vector>


// -------------------------------------------------------------
//              Problem
// -------------------------------------------------------------
using NodeIndex = size_t;

struct Edge
{
    // An directed edge is defined by two nodes: from and to.
    NodeIndex   from;
    NodeIndex   to;
};

class Graph
{
public:
    using Edges = std::vector<Edge>;

    Graph(NodeIndex numNodes, Edges edges)
     : m_numNodes(numNodes)
     , m_edges(std::move(edges))
    {
    }

    NodeIndex GetNumNodes() const
    {
        return m_numNodes;
    }

    const Edges& GetEdges() const
    {
        return m_edges;
    }

private:
    NodeIndex   m_numNodes;
    Edges       m_edges;
};


// -------------------------------------------------------------
//              Solver API
// -------------------------------------------------------------
// Number of paths could grow exponentially with the size of the graph.
// Consider using some library capable of handling arbitrary large integers.
using BigInt = size_t;

BigInt
CountPaths(const Graph& graph, NodeIndex initialNode, NodeIndex finalNode);


// -------------------------------------------------------------
//              Solver implementation
// -------------------------------------------------------------
struct Node
{
    // A node is considered 'complete' if the number of paths from the starting node to this node is already known.
    // Otherwise the node is considered 'incomplete'.

    using EdgeIndex = NodeIndex;
    using ChildIndex = NodeIndex;

    EdgeIndex           m_pending = 0;          // Number of incoming edges from incomplete nodes.
    ChildIndex          m_numChildren = 0;      // Number of outgoing edges, same as the number of child nodes.
    ChildIndex          m_firstLink = 0;        // Index of the first child link.
    ChildIndex          m_lastLink = 0;         // Index of the last child link.
    BigInt              m_numPaths = 0;         // Number of trajectories from the starting node to this node.
};

BigInt
CountPaths(const Graph& graph, NodeIndex initialNode, NodeIndex finalNode)
{
    using EdgeIndex = Node::EdgeIndex;
    using ChildIndex = Node::ChildIndex;
    using Nodes = std::vector<Node>;
    using IndexVector = std::vector<NodeIndex>;

    NodeIndex           numNodes = graph.GetNumNodes();
    EdgeIndex           numEdges = graph.GetEdges().size();
    Nodes               nodes(numNodes);
    IndexVector         queue(numNodes);
    IndexVector         childLinks(numEdges);

    // Calculate the number of incoming ('pending') edges
    // and the number of outgoing ('child') edges for each node.
    // O(E).
    for (const Edge& edge : graph.GetEdges())
    {
          nodes[edge.from].m_numChildren;
          nodes[edge.to].m_pending;
    }

    // Allocate chunks of 'childLinks' array for child nodes of each node.
    // O(N).
    ChildIndex          offset = 0;
    for (Node& node : nodes)
    {
        node.m_firstLink = offset;
        node.m_lastLink = offset;
        offset  = node.m_numChildren;
    }

    // Initialize child links for each node.
    // O(E).
    for (const Edge& edge : graph.GetEdges())
    {
        childLinks[nodes[edge.from].m_lastLink  ] = edge.to;
    }

    // Initialize the queue of nodes for BFS.
    size_t      queueTop = 0;

    // Post the starting node to the queue.
    nodes[initialNode].m_numPaths = 1;  // There's only one way to reach the starting node.
    nodes[initialNode].m_pending  = 1;  // Prevent the starting node from being posted again. Overflow arelt, though.
    queue[queueTop  ] = initialNode;

    EdgeIndex   initialNodePending = nodes[initialNode].m_pending;
    EdgeIndex   finalNodePending = nodes[finalNode].m_pending;

    if (finalNodePending == 0)
    {
        throw std::runtime_error("Final node is unreachable because it has no incoming edges.");
    }

    // Every node that has no incoming edges can be posted to the queue right now.
    // Such nodes are unreachabe, so they contribute 0 to the number of possible paths.
    // Still they should be processed, otherwise some edges could remain "pending" forever.
    // O(N).
    for (NodeIndex i = 0; i < numNodes;   i)
    {
        if (nodes[i].m_pending == 0)
            queue[queueTop  ] = i;
    }

    // Main BFS loop.
    // O(N) * O(E), worst case.
    for (size_t currentIndex = 0; currentIndex < queueTop;   currentIndex)
    {
        // Pick the first node in the queue.
        const Node&     current = nodes[queue[currentIndex]];

        // We already know how many paths lead to this node.
        std::cout << "Processing node " << queue[currentIndex] << " with path count = " << current.m_numPaths << std::endl;

        // Follow each edge from the current node to the next 'child' node.
        for (NodeIndex linkIndex = current.m_firstLink; linkIndex < current.m_lastLink;   linkIndex)
        {
            NodeIndex   childIndex = childLinks[linkIndex];
            Node&       child = nodes[childIndex];
            std::cout << "  Child node " << childIndex << " path count = " << child.m_numPaths << " " << current.m_numPaths << std::endl;
            child.m_numPaths  = current.m_numPaths;
            --child.m_pending;

            // If all incoming edges to this child have been accrued,
            // then this child node becomes 'complete' and is ready to be propagated.
            if (child.m_pending == 0)
            {
                // Post it to the queue.
                std::cout << "  Child node " << childIndex << " is ready" << std::endl;
                queue[queueTop  ] = childIndex;
            }
        }

        // Visual sugar
        std::cout << std::endl;
    }

    if (initialNodePending != nodes[initialNode].m_pending)
    {
        throw std::runtime_error("Initial node has been visited again. It's part of some loop!");
    }

    // See if all ways to reach the final node have been explored, then the answer is ready.
    if (nodes[finalNode].m_pending == 0)
    {
        return nodes[finalNode].m_numPaths;
    }

    // At this point there could be some loops.
    // But it's not clear if they affect the result.
    nodes[initialNode].m_pending -= 1;

    // Find all blocking nodes and post them to the queue.
    // O(N).
    queueTop = 0;
    for (NodeIndex i = 0; i < numNodes;   i)
    {
        if ((nodes[i].m_pending != 0) && (nodes[i].m_numPaths != 0))
        {
            std::cout << "Node " << i << " can be reached but it's still incomplete. It might belong to some loop." << std::endl;
            queue[queueTop  ] = i;
            nodes[i].m_pending = 0;
        }
    }

    initialNodePending = nodes[initialNode].m_pending;

    // Let's do BFS flood fill ignoring any possible loop.
    for (size_t currentIndex = 0; currentIndex < queueTop;   currentIndex)
    {
        const Node&     current = nodes[queue[currentIndex]];
        std::cout << "Flood filling from node " << queue[currentIndex] << std::endl;
        for (NodeIndex linkIndex = current.m_firstLink; linkIndex < current.m_lastLink;   linkIndex)
        {
            NodeIndex   childIndex = childLinks[linkIndex];
            Node&       child = nodes[childIndex];
            if (child.m_pending != 0)
            {
                std::cout << "  Next node is " << childIndex << std::endl;
                queue[queueTop  ] = childIndex;
                child.m_pending = 0;
            }
        }
    }

    // See if the final node has been reached after flood-filling through all loops.
    if (nodes[finalNode].m_pending == 0)
    {
        // Yes, the final node is behind some loop(s).
        throw std::runtime_error("There are loops, and they stand in the way!");
    }

    if (initialNodePending != nodes[initialNode].m_pending)
    {
        throw std::runtime_error("Initial node has been visited again. It's part of some loop-2!");
    }

    // There are loops, but they don't affect the result.
    return nodes[finalNode].m_numPaths;
}


// -------------------------------------------------------------
//              Example
// -------------------------------------------------------------
Graph
InitGraph()
{
    using Edges = Graph::Edges;

    NodeIndex   numNodes = 10;
    Edges       edges;

    edges.push_back({0, 1});
    edges.push_back({0, 2});
    edges.push_back({0, 3});

    edges.push_back({1, 8});

    edges.push_back({3, 4});
    edges.push_back({3, 5});

    edges.push_back({4, 6});
    edges.push_back({5, 6});

    edges.push_back({2, 7});
    edges.push_back({6, 7});

    edges.push_back({7, 8});

    edges.push_back({7, 9});
    edges.push_back({8, 9});

// Loop!
//    edges.push_back({8, 2});

    return Graph(numNodes, edges);
}

int main()
{
    Graph       graph = InitGraph();
    NodeIndex   initialNode = 0;
    NodeIndex   finalNode = 9;

    try
    {
        BigInt      numPaths = CountPaths(graph, initialNode, finalNode);
        std::cout << "Number of paths from " << initialNode << " to " << finalNode << " = " << numPaths << std::endl;
    }
    catch (const std::exception& e)
    {
        std::cout << e.what() << std::endl;
    }

    return 0;
}
  • Related