How to flatten a series of begin and end indices

Time:07-22

I have been working on a project lately where I need to annotate a sentence at specific indices using vanilla JavaScript. For example: The dog runs really fast.

The input I have to work with looks something like this:

[
     {beginIndex: 4, endIndex: 7, style: "bold"},
     {beginIndex: 5, endIndex: 7, style: "italics"},
     {beginIndex: 13, endIndex: 19, style: "italics"},
]

The challenge I'm running into is dealing with overlapping ranges (as seen at indices 0 and 1 of the array above).

I figured the best way to handle overlapping would be to restructure the array like this:

[
     {beginIndex: 4, endIndex: 5, style: "bold"},
     {beginIndex: 5, endIndex: 7, style: "italics bold"},
     {beginIndex: 13, endIndex: 19, style: "italics"},
]

And then use the array to dynamically build some simple html that looks like this:

<span>The <span class="bold">d</span><span class="italics bold">og</span> runs <span class="italics">really</span> fast.</span>

I've spent a few hours trying to figure out how to take the input and generate an output that handles overlapping indices. I feel like this should be a pretty simple CS exercise, but I just can't seem to figure out how to do it. Does anybody have any advice on how to handle this situation? Or is this even the best way to go about annotating text using vanilla JS and HTML?

This image kind of illustrates what I'm looking for. 3 inputs that identify some indices to annotate, and the 4th line would be the output, combining all of the annotations onto one line.

(Image: visualization of flattening the indexes)

CodePudding user response:

I recommend splitting each range into operations that either remove or add a trait. Then you can sort them by index:

const ranges = [
    {beginIndex: 4, endIndex: 7, style: "bold"},
    {beginIndex: 5, endIndex: 7, style: "italics"},
    {beginIndex: 13, endIndex: 19, style: "italics"},
];

const operations = ranges.flatMap(range => [
    {index: range.beginIndex, style: range.style, enable: true},
    {index: range.endIndex, style: range.style, enable: false},
]);

operations.sort((a, b) => a.index - b.index);

const fragment = document.createDocumentFragment();
const container = document.createElement("span");
const text = "The dog runs really fast.";
let lastIndex = 0;

for (const op of operations) {
    if (lastIndex !== op.index) {
        fragment.appendChild(container.cloneNode(true))
            .textContent = text.substring(lastIndex, op.index);
        lastIndex = op.index;
    }
    container.classList.toggle(op.style, op.enable);
}

fragment.appendChild(container).textContent = text.substring(lastIndex);
document.body.appendChild(fragment);

/* CSS */
.bold {
    font-weight: bold;
}

.italics {
    font-style: italic;
}

Be careful about what those indexes represent, though. The above code uses substring, which operates on UTF-16 code units; other common choices are UTF-8 code units (bytes) or Unicode scalar values. Test your code on emoji if applicable! ✨
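
To make that concrete, here is a small sketch of the mismatch and one way to bridge it. It assumes the incoming indices count Unicode code points; `codePointToCodeUnitIndex` is a made-up helper name, not a built-in.

```javascript
// Hypothetical helper: converts an index counted in code points
// into the UTF-16 code-unit index that substring() expects.
function codePointToCodeUnitIndex(text, codePointIndex) {
  let units = 0;
  let points = 0;
  for (const ch of text) {      // string iteration yields whole code points
    if (points === codePointIndex) break;
    units += ch.length;         // 1 for BMP characters, 2 for surrogate pairs
    points++;
  }
  return units;
}

const sample = "The 🐶 runs";
console.log(sample.length);      // 11 (UTF-16 code units)
console.log([...sample].length); // 10 (code points)

// "runs" starts at code point 6, but at code unit 7.
console.log(sample.substring(codePointToCodeUnitIndex(sample, 6))); // "runs"
```

If the indices arrive as UTF-8 byte offsets instead, a different conversion is needed; this sketch does not cover that case.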

CodePudding user response:

I'm not sure it's the best way, but to accomplish what you want to accomplish, something like this would convert your formatting array...

const formattingArray = [
  {beginIndex: 4, endIndex: 7, style: "bold"},
  {beginIndex: 5, endIndex: 7, style: "italics"},
  {beginIndex: 13, endIndex: 19, style: "italics"},
]

// assumes array is already sorted by beginIndex
let outputFormattingArray = [];
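for ( let i = 0; i < formattingArray.length; i++ ) {
  // only handles an overlap between two adjacent ranges, where the second ends at or after the first
  if ( formattingArray.length > i + 1 && formattingArray[i].endIndex > formattingArray[i + 1].beginIndex ) {
    // part covered only by the first range
    outputFormattingArray.push( {
      beginIndex: formattingArray[i].beginIndex,
      endIndex: formattingArray[i + 1].beginIndex,
      style: formattingArray[i].style
    } );
    // overlapping part carries both styles
    outputFormattingArray.push( {
      beginIndex: formattingArray[i + 1].beginIndex,
      endIndex: formattingArray[i].endIndex,
      style: formattingArray[i].style + ' ' + formattingArray[i + 1].style
    } );
    // part covered only by the second range (empty when both ranges end at the same index)
    outputFormattingArray.push( {
      beginIndex: formattingArray[i].endIndex,
      endIndex: formattingArray[i + 1].endIndex,
      style: formattingArray[i + 1].style
    } );
    i++; // skip the second range, it has already been handled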
  } else {
    outputFormattingArray.push( formattingArray[i] );
  }
}

console.log( outputFormattingArray );

To be honest though, it may just be better to ensure the input does not include overlapping tags at all, and throw an error if it does. HTML itself would not allow overlapping tags that way, so why should you? Callers can easily restructure their data to pass a validation check and still format the text in any way needed.
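
A validation check along those lines might look like the sketch below. `assertNoOverlap` is a hypothetical name, and treating ranges as half-open (so one range may begin exactly where the previous one ends) is an assumption based on the sample data.

```javascript
// Hypothetical validator: throws if any two ranges overlap.
function assertNoOverlap(ranges) {
  // Sort a copy so the caller's array is left untouched.
  const sorted = [...ranges].sort((a, b) => a.beginIndex - b.beginIndex);
  for (let i = 1; i < sorted.length; i++) {
    const prev = sorted[i - 1];
    const curr = sorted[i];
    // Once sorted by beginIndex, any overlap shows up between neighbours.
    if (curr.beginIndex < prev.endIndex) {
      throw new RangeError(
        `Range ${curr.beginIndex}-${curr.endIndex} overlaps ` +
        `${prev.beginIndex}-${prev.endIndex}`
      );
    }
  }
}

try {
  assertNoOverlap([
    {beginIndex: 4, endIndex: 7, style: "bold"},
    {beginIndex: 5, endIndex: 7, style: "italics"},
  ]);
} catch (err) {
  console.log(err.message);
}
```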
