Dear(est) Stack Exchangers,
I am currently implementing some algorithms which require access to a data structure of a “queue” (FIFO). I am using the ST monad , and thus am looking for queue implementations which complement well with the ST monad’s "mutablity of memory”. At this point, I am just tempted to use newSTRef
on a list (but again, accessing the last element is O(n) complexity, which I would want to avoid as much as I can). I also thought to use Data.Sequence, though I am not sure if it actually will be “mutable” if used inside an ST monad without newSTRef
initialisation.
Can the kind members of Stack Exchange guide a beginner in Haskell as to what would be the best data structure (or module) in the aforementioned context?
CodePudding user response:
Options include implementing a traditional ring buffer on top of STArray
, or using a mutable singly-linked list built out of STRef
s, as in:
type CellRef s a = STRef s (Cell s a)
data Cell s a = End | Cell a (CellRef s a)
data Q s a = Q { readHead, writeHead :: CellRef s a }
If you want the easy growth of Q
but like the low pointer overhead of a ring buffer, you can get a middle ground by making each cell have an STArray
that slowly fills up. When it's full, allocate a fresh cell; when reading from it empties it, advance to the next cell. You get the idea.
CodePudding user response:
There is a standard implementation of a FIFO queue as two LIFO stacks, one containing items starting from the front of the queue (with the next item to be removed on top), and the other containing items starting from the back (with the most recently pushed item on top). When popping from the queue, if the front stack is empty, you replace it with the reversal of the back stack.
If both stacks are implemented as Haskell lists, then adding a value to the queue is O(1), and removing a value is amortized O(1) if the data structure is used in a single-threaded way. The constant factor isn't bad. You can put the whole data structure in a STRef (which guarantees single-threaded use). The implementation is just a few lines of code. You should definitely do this in preference to your O(n) single-list idea.
You can also use Data.Sequence
. Like the two-stack queue, it is a pure functional data structure, i.e., operations on it return a new data structure and leave the old one unchanged. But, like the two-stack queue, you can make it mutable by simply writing the new data structure into the STRef that held the old one. The constant factor for Data.Sequence
is probably a bit worse than the two-stack queue, but in exchange you get a larger set of efficient operations.
The mutable list in David Wagner's answer is likely to be less efficient because it requires two heap objects per item in the queue. You may be able to avoid that in GHC by writing
Cell a {-# UNPACK #-} !(CellRef s a)
in place of Cell a (CellRef s a)
. I'm not certain that that will work, though. If it does, this is likely to be somewhat faster than the other list-based approaches.