Home > other >  Disjoint Set Union with directed graph
Disjoint Set Union with directed graph

Time:02-09

I understand the DSU strictly works with undirected graphs from this stack overflow question - Can we detect cycles in directed graph using Union-Find data structure?

Nevertheless, I am currently working on a problem that involves 400000 queries and a graph with at most 400000 nodes, in which there are two possible queries:

  • Connect nodes a and b (directed, of course)

  • Output "true" if node "x" is reachable from node 1. Otherwise, print "false."

However, my original instinct was to use DSU; that obviously would not work. Any suggestions? Thank you.

CodePudding user response:

What you want is a data structure for a problem called 'incremental reachability'. There are multiple ways to construct such a data structure, all have some different update/query time tradeoffs. A very simple way to achieve the goal is to use an adjacency list and use BFS every time a user queries if node "x" is reachable from 1. This gets you update time: O(1) and query time: O(m).

A more complicated idea is 'even-shiloach' trees [1], here a BFS tree is maintained efficiently. Total update time: O(nm) Query time: O(1).

An experimental analysis of similar algorithms can be found in [2].

[1] Shimon Even, Yossi Shiloach: An On-Line Edge-Deletion Problem. J. ACM 28(1): 1-4 (1981) https://dl.acm.org/doi/10.1145/322234.322235

[2] Fully Dynamic Single-Source Reachability in Practice: An Experimental Study, Kathrin Hanauer, Monika Henzinger and Christian Schulz https://arxiv.org/abs/1905.01216

CodePudding user response:

As long as you don't have to delete connections, and you're only interested in reachability from a single source, then you can do this in amortized constant time per operation pretty easily.

  • Maintain a set of nodes that are known to be reachable from node 1. Initially this would just contain node 1. We'll call these nodes 'marked'.
  • When you connect a->b, if a is marked but b is not, then mark all the newly reachable nodes. This is a transitive closure operation:
    • put b in a queue.
    • while the queue is not empty, remove a vertex, mark it reachable, and put all of its unmarked neighbors in the queue.
  •  Tags:  
  • Related