Home > Blockchain >  Haskell's Data.Text: Good to use as basis for text editor?
Haskell's Data.Text: Good to use as basis for text editor?

Time:12-19

An imperative programmer for a long time, every so often I look back in on Haskell and play a little more and learn a little more.

A question arose when thinking about a possible project:

How to implement data that I explicitly want to change in a language that treats data as immutable?

A specific case is the text that is edited by a text editor. Data.Text is available but it says things like appending a character to the end of a text involves copying the entire text over. Because of things like that, I'm wondering if Data.Text is the appropriate structure to use to implement text who's purpose is to change.

Is there a generalized thinking that addresses this sort of thing?

Over the years, I've written two implementations of text machinery in C#. One used a linked list of blocks of 256 (or 512, I forget, it's been a while) characters, similar to what's described in the Sam text editor. The other is a slightly modified version of a design done by Niklaus Wirth (who got it from someone else) in the Oberon System where text is implemented by two files (one for the original text, the other for newly entered data) and a linked list of pieces that is used to assemble and edit the text. I used two .NET StringBuilders instead of files, only append to them, and the whole things performs much better that just using StringBuilders as the text itself.

Note: I have a reasonable working knowledge of laziness, strictness, tail-recursion, thunks. Fusion is less clear to me but I've read a little on it.

I have a good bit of experience with SQL so I don't have a problem with a compiler doing things I don't fully understand, but in that language I know how to conceptualize the problem better than I do in Hasell.

CodePudding user response:

The standard reference for editor implementation in Haskell is probably the Yi editor. Its author(s) wrote some papers discussing this, e.g.:

  • “Yi: An Editor for Haskell in Haskell” (DOI, PDF)

  • “Lazy Functional Incremental Parsing” (DOI, PDF)

Like many text editors, Yi uses a rope as the representation of text buffers. Specifically, it’s a purely functional rope called Yi.Rope.YiString containing chunks of Text, defined as a specialisation of Data.FingerTree.FingerTree, the same data structure underlying Data.Sequence.Seq. There are further optimisations such as caching of indices into the text and batching of operations on the buffer, but the core is just a persistent tree of Unicode text chunks.

Using a persistent data structure incurs a logarithmic time cost, but makes certain features (such as cached history and incremental computation) simpler to implement correctly.

  • Related