Skiena's Algorithm Design Manual (3rd ed., p. 204) refers to adjacency lists as opposed to general adjacency representations, defining them as assigning to each vertex a a singly linked list L_a with underlying set set(L_a) = {b | (x, b) <- edges, x == a}.
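To make the definition concrete, here is a minimal sketch of what I take Skiena's representation to be: each vertex heads a singly linked list of outgoing edges, with constant-time insertion at the head. (The class and function names are my own, not Skiena's.)

```python
class EdgeNode:
    """One cell of a singly linked adjacency list L_a (my own sketch)."""
    def __init__(self, target, next_node=None):
        self.target = target    # the vertex b in an edge (a, b)
        self.next = next_node   # next edge out of the same vertex a

def insert_edge(adj, a, b):
    """Prepend b to vertex a's list: O(1) insertion at the head."""
    adj[a] = EdgeNode(b, adj[a])

# Directed graph on vertices {0, 1, 2} with edges (0,1), (0,2), (1,2):
adj = [None, None, None]        # adj[a] is the head of L_a
for a, b in [(0, 1), (0, 2), (1, 2)]:
    insert_edge(adj, a, b)
```

Iterating L_0 visits the targets in reverse insertion order, but set(L_0) = {1, 2} as the definition requires.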
I'm surprised that Skiena presents the singly linked list as the definitive data structure implementing the collections L_a. My impression is that linked lists are generally losing favor compared with arrays and hash tables, because:
- They are not cache-friendly to iterate over (as arrays are), and the gap between processor speed and main-memory access has become increasingly important. (See, for instance, this video (7m) by Stroustrup.)
- They don't bring much to the table, particularly when order isn't important. The advantage of linked lists over arrays is that they admit constant-time insertion and deletion. But when we don't care about order, arrays support these in constant time as well: append for insertion and "swap and pop" for deletion. A hash table would have the additional advantage of constant-time search. My understanding is that hash tables cost more memory than a linked list or an array, but this consideration has become relatively less important. (Perhaps this claim isn't meaningful in the absence of a specific application.)
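The "swap and pop" trick mentioned above can be sketched as follows (a minimal illustration, not taken from any of the cited sources):

```python
def remove_unordered(arr, i):
    """Delete arr[i] in O(1) when element order doesn't matter:
    overwrite it with the last element, then pop the shrunken tail."""
    arr[i] = arr[-1]
    arr.pop()
    return arr
```

For example, deleting index 1 from [10, 20, 30, 40] yields [10, 40, 30]: the order changes, but that is exactly the case where we said order doesn't matter.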
Other sources treat adjacency lists differently. For instance, Wikipedia presents an implementation where the L_a are arrays. And in Stone's Algorithms for Functional Programming the L_a are unordered sets, implemented ultimately as Scheme lists (which in turn struck me as strange).
My Question: Is there a consideration I'm missing which gives singly linked lists a significant advantage in adjacency representations?
I add an earnest request that before you vote to close this question, or post a comment with an uncharitable tone, you ask yourself whether you are really helping this site achieve its goals by doing so.
CodePudding user response:
I don't think there's any general agreement on singly-linked lists as the default representation of adjacency lists in most real-world use cases.
A singly-linked list is, however, pretty much the most restrictive implementation of an adjacency list you could have, so in a book about "Algorithm Design", it makes sense to think of adjacency lists in this representation unless you need something special from them, like random access, bidirectional iteration, binary search, etc.
When it comes to practical implementations of algorithms on explicit graphs (most implementations are on implicit graphs), I don't think singly-linked lists are usually a good choice.
My go-to adjacency list graph representation is a pair of parallel arrays:
- Vertices are numbered from 0 to n-1
- There is an edge array that contains all of the edges sorted by their source vertex number. For an undirected graph, each edge appears in here twice. The source vertices often don't need to be stored in here.
- There is a vertex array that stores, for each vertex, the end position of its edges in the edge array.
This is a nice, compact, cache-friendly representation that is easy to work with and requires only two allocations.
I can usually find an easy way to construct a graph like this in linear time by first filling the vertex array with counts, then changing the counts to start positions (shifted cumulative sum), and then populating the edge array, advancing those positions as edges are added.
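Here is a sketch of that construction (my own rendering of the scheme described above; I use an offsets array of length n+1 holding start positions, so each vertex's end position is the next vertex's start — a common variant of the same idea):

```python
def build_adjacency(n, edges):
    """Build the parallel-array representation from a directed edge list
    in linear time. Returns (offsets, targets): the edges leaving vertex v
    are targets[offsets[v]:offsets[v+1]]."""
    offsets = [0] * (n + 1)
    # Pass 1: count the out-degree of each vertex.
    for u, v in edges:
        offsets[u + 1] += 1
    # Shifted cumulative sum: offsets[v] becomes the start of v's block.
    for v in range(n):
        offsets[v + 1] += offsets[v]
    # Pass 2: place each edge, advancing a per-vertex cursor.
    cursor = offsets[:-1].copy()
    targets = [0] * len(edges)
    for u, v in edges:
        targets[cursor[u]] = v
        cursor[u] += 1
    return offsets, targets
```

For an undirected graph, you would add each edge to the input list in both directions before calling this. Only the two result arrays are long-lived, matching the "two allocations" point above.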