How to map visible text indices to their location in an HTML tree

Time:01-29

I am trying to implement a web-based rich text editor that automatically decorates the user's text as they type (think spellcheck).

The issue is that the server only processes raw text, and returns annotations with their index and length in the raw text.

So the complete flow must look like this:

  1. When the spellcheck routine triggers, convert the contents of the HTML structure into raw text.
  2. Query the server for spellcheck annotations.
  3. From the returned indices, find the corresponding HTML portion and surround it with underline tags.

For step 1 I am using Rangy, in particular its TextRange module. However, for step 3 I can't find a proper way to convert text indices to their corresponding HTML node and offset.

I'm looking for a robust solution that can handle Unicode characters, words that are split in the middle by a tag, and any other unusual HTML structure.

FYI, I am using the Pell rich text editor, but the problem is the same with any contenteditable-based editor, and if another one solves this problem I will happily switch.

What's the best way to achieve this goal?

CodePudding user response:

Turns out I totally missed the selectCharacters() method from Rangy which solves this problem.

const content = document.getElementById("content");
const range = rangy.createRange();
// indexes are in unicode code points, not bytes
range.selectCharacters(content, /* from */ 0, /* to */ 5);
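Applied to step 3 of the question, this could look roughly like the sketch below. It assumes Rangy's TextRange and class applier modules are both loaded, that the server's indices refer to the same text Rangy produced in step 1, and that each annotation has the shape { index, length }. The function name, the annotation shape and the "misspelled" class are made up for illustration.

```javascript
// Sketch only: `rangy` is passed in as an argument so the function does not rely on a global.
// `annotations` is assumed to be an array of { index, length } objects (hypothetical shape).
function underlineAnnotations(rangy, container, annotations, className) {
  // Class applier from Rangy's class applier module; wraps ranges in <span class="...">.
  const applier = rangy.createClassApplier(className);
  for (const { index, length } of annotations) {
    const range = rangy.createRange();
    // selectCharacters comes from the TextRange module: (container, startIndex, endIndex).
    range.selectCharacters(container, index, index + length);
    applier.applyToRange(range);
  }
  return annotations.length;
}

// Possible usage in the page, with a CSS rule such as
// .misspelled { text-decoration: underline wavy red; }
// underlineAnnotations(rangy, document.getElementById("content"), [{ index: 0, length: 5 }], "misspelled");
```

Adding the spans does not change the visible text, so the remaining indices should stay valid while the annotations are applied one after another.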

CodePudding user response:

To map visible text indices to their location in an HTML tree, you can use a technique called "text node traversal": walk the HTML tree and keep track of how much text you have passed as you go. In JavaScript, the approach looks like this:

  1. Create a recursive function that takes the current node as its argument.
  2. If the current node is a text node, append its text to an accumulated string and record, in an array, the node together with the index at which its text starts in that string.
  3. If the current node has children, call the function recursively for each child node.
  4. Return the array of indices and the accumulated string.

You can then use this array to locate specific text within the HTML tree. For example, given a text index i, find the entry whose text range contains i; the offset inside that text node is i minus the entry's start index.
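A minimal sketch of that traversal follows. The names buildTextIndex and locate are made up for illustration, and the code only relies on the nodeType, childNodes and data properties, so it works on DOM nodes as well as on plain objects of the same shape. Offsets are counted in UTF-16 code units, as DOM text offsets are, so indices expressed in code points would need converting first.

```javascript
const TEXT_NODE = 3; // same value as Node.TEXT_NODE in the DOM

// Walk the tree depth-first and record where each text node starts
// and ends in the concatenated text.
function buildTextIndex(root) {
  const entries = []; // { node, start, end }, in UTF-16 code units
  let text = "";
  (function walk(node) {
    if (node.nodeType === TEXT_NODE) {
      if (node.data.length > 0) {
        entries.push({ node, start: text.length, end: text.length + node.data.length });
        text += node.data;
      }
      return;
    }
    for (const child of node.childNodes || []) walk(child);
  })(root);
  return { text, entries };
}

// Translate the span [from, to) of the concatenated text into one
// { node, startOffset, endOffset } piece per text node it touches.
function locate(entries, from, to) {
  const pieces = [];
  for (const { node, start, end } of entries) {
    if (end <= from || start >= to) continue; // no overlap with this node
    pieces.push({
      node,
      startOffset: Math.max(from, start) - start,
      endOffset: Math.min(to, end) - start,
    });
  }
  return pieces;
}
```

A word split by a tag comes back as several pieces. In a browser, each piece can be turned into a Range with document.createRange(), setStart(node, startOffset) and setEnd(node, endOffset), and then wrapped individually, for instance with range.surroundContents(), since each piece stays inside a single text node.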

Please be aware that this is a simple approach: it does not take into account hidden text or styles that affect the visibility of text. It is described here for JavaScript, but a similar approach works in other languages.
