How to replace overlapping strings in Javascript without destroying the HTML structure

Time:02-12

I have a string and an array of N items:

<div>
sometimes the fox can fly really high
</div>
const arr = ['the fox can', 'fox can fly', 'really high']

I want to replace the text inside the div with HTML that highlights each phrase in the array, without breaking the HTML. A simple loop with replace does not work: once the first highlight span is inserted, later phrases that overlap it no longer match, because the span markup breaks indexOf or includes on the innerHTML. I can read the plain text with innerText, but that gives me no way to add the "next" span without destroying the highlights already in the HTML. Ideally, I also want to customize the class name per phrase rather than use one generic highlight class.
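
To make the problem concrete, here is a minimal sketch of the naive loop described above. The `highlight` class and the variable names are only placeholders for illustration. After the first phrase is wrapped, the overlapping second phrase can no longer be found in the HTML string:

```javascript
// Naive approach: replace each phrase in turn on the HTML string itself.
const phrases = ['the fox can', 'fox can fly', 'really high'];
let html = 'sometimes the fox can fly really high';

const missed = [];
for (const phrase of phrases) {
  if (!html.includes(phrase)) {
    // A previously inserted <span> now splits this phrase, so it is not found.
    missed.push(phrase);
    continue;
  }
  html = html.replace(phrase, `<span class="highlight">${phrase}</span>`);
}

console.log(html);
// sometimes <span class="highlight">the fox can</span> fly <span class="highlight">really high</span>
console.log(missed);
// ["fox can fly"]
```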

The outcome should be

<div>
sometimes
<span >the <span >fox can</span></span><span > fly</span> <span >really high</span>
</div>

What have I tried?

I've thought about this a lot and cannot find any resources online that help with this scenario. My current solution, shown below, also needs extra values such as charStart and charEnd for each phrase. I don't like it because it depends on the DOMParser() API, feels really hacky, and definitely isn't performant. I get a "vibe" that I shouldn't be doing it this way and that there must be better solutions, so I am reaching out to SO for ideas on how to accomplish this.

      let text = `<p id="content">${content}</p>`
      let parser = new DOMParser().parseFromString(text, "text/html")

      for (const str of strings) {
        const content = parser.querySelector("#content")
        let descLength = 0

        for (const node of content.childNodes) {
          const text = node.textContent

          let newTextContent = ""

          for (const letter in text) {
            let newText = text[letter]
            if (descLength === str.charStart) {
              newText = `<em data-id="${str.id}">${text[letter]}`
            } else if (descLength === str.charEnd) {
              newText = `${text[letter]}</em>`
            }

            newTextContent += newText
            descLength++
          }

          node.textContent = newTextContent
        }

        // Replace the &lt; with `<` and replace &gt; with `>` to construct the HTML as text inside lastHtml
        const lastHtml = parser
          .querySelector("#content")
          .outerHTML.split("&lt;")
          .join("<")
          .split("&gt;")
          .join(">")

        // Redefine the parser variable with the updated HTML and let it automatically correct the element structure
        parser = new DOMParser().parseFromString(lastHtml, "text/html")

        /**
         * Replace the placeholder `<em>` element with the span elements to prevent future issues. We need the HTML
         * to be invalid for it to be correctly fixed by DOMParser, otherwise the HTML would be valid and *not* render how we'd like it to
         * Invalid => `<span>test <em>title </span>here</em>
         * Invalid (converted) => `<span>test <em>title </em></span><em>here</em>
         * Valid => `<span>test <span>title </span>here</span>
         */

        parser.querySelector("#content").innerHTML = parser
          .querySelector("#content")
          .innerHTML.replaceAll("<em ", "<span ")
          .replaceAll("</em>", "</span>")
      }

Answer:

I'll go over your example just to give an idea. The code below is not a clean function; please adjust it to your needs.

const str = "sometimes the fox can fly really high";
const arr = ['the fox can', 'fox can fly', 'really high'];

// First, find the indices of start and end positions for your substrings.
// Call them event points and push them to an array.
// Assumed definition: a lowercased copy of str (the phrases in arr are already lowercase).
const strLower = str.toLowerCase();
const eventPoints = [];
arr.forEach((a, i) => {
  let index = strLower.indexOf(a)
  while (index !== -1) {
    let tagClass = `highlight-${i}`
    eventPoints.push({ pos: index, className: tagClass, eventType: "start" })
    eventPoints.push({ pos: index + a.length, className: tagClass, eventType: "end" })
    index = strLower.indexOf(a, index + 1)
  }
});

// Sort the event points based on the position properties
eventPoints.sort((a, b) => a.pos < b.pos ? -1 : a.pos > b.pos ? 1 : 0);

// Init the final string, a stack and an index to keep track of the current position on the full string
let result = "";
let stack = [];
let index = 0;
// Loop over eventPoints
eventPoints.forEach(e => {
    // concat the substring between index and e.pos to the result
    result += str.substring(index, e.pos);
    if (e.eventType === "start") {
        // when there is a start event, open a span
        result += `<span class="${e.className}">`;
        // keep track of which span is opened
        stack.push(e.className);
    }
    else {
        // when there is an end event, close tags opened after this one, keep track of them, reopen them afterwards
        let tmpStack = [];
        while (stack.length > 0) {
            result += "</span>";
            let top = stack.pop();
            if (top === e.className) {
                break;
            }
            tmpStack.push(top);
        }
        while (tmpStack.length > 0) {
            let tmp = tmpStack.pop();
            result += `<span class="${tmp}">`;
            stack.push(tmp);
        }
    }
    index = e.pos;
});

// append whatever text remains after the last event point
result += str.substring(index, str.length)


console.log(result);
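
If you want something closer to a reusable function, here is one possible sketch of the same event-point idea. The names `highlightPhrases` and `escapeHtml` and the `{ phrase, className }` item shape are my own assumptions, not part of any library. It also escapes the plain text so that characters like `<` in the content cannot break the markup, and it breaks ties at equal positions so that inner spans close first. It assumes that lowercasing does not change the string length, which holds for ordinary ASCII text but not for every Unicode character.

```javascript
// Escape characters that are special in HTML text and attribute values.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// items: [{ phrase, className }] (hypothetical shape chosen for this sketch).
function highlightPhrases(text, items) {
  const lower = text.toLowerCase();
  const events = [];

  // Collect a start and an end event for every occurrence of every phrase.
  for (const { phrase, className } of items) {
    const needle = phrase.toLowerCase();
    if (needle.length === 0) continue; // an empty phrase would loop forever
    let from = lower.indexOf(needle);
    while (from !== -1) {
      const match = { className, from, to: from + needle.length };
      events.push({ pos: match.from, type: "start", match });
      events.push({ pos: match.to, type: "end", match });
      from = lower.indexOf(needle, from + 1);
    }
  }

  // Sort by position; at the same position, ends come before starts,
  // inner spans close first, and longer spans open first.
  events.sort((a, b) => {
    if (a.pos !== b.pos) return a.pos - b.pos;
    if (a.type !== b.type) return a.type === "end" ? -1 : 1;
    return a.type === "end"
      ? b.match.from - a.match.from
      : b.match.to - a.match.to;
  });

  const open = (m) => `<span class="${escapeHtml(m.className)}">`;
  const stack = []; // matches whose span is currently open
  let result = "";
  let cursor = 0;

  for (const e of events) {
    result += escapeHtml(text.slice(cursor, e.pos));
    cursor = e.pos;
    if (e.type === "start") {
      result += open(e.match);
      stack.push(e.match);
      continue;
    }
    // Close spans down to the one that ends here, then reopen the others.
    const reopened = [];
    while (stack.length > 0) {
      const top = stack.pop();
      result += "</span>";
      if (top === e.match) break;
      reopened.push(top);
    }
    while (reopened.length > 0) {
      const m = reopened.pop();
      result += open(m);
      stack.push(m);
    }
  }
  return result + escapeHtml(text.slice(cursor));
}

const highlighted = highlightPhrases("sometimes the fox can fly really high", [
  { phrase: "the fox can", className: "highlight-0" },
  { phrase: "fox can fly", className: "highlight-1" },
  { phrase: "really high", className: "highlight-2" },
]);
console.log(highlighted);
```

In the browser you would read the input from the element's `textContent` and assign the returned string to its `innerHTML`. Because the function always starts from the plain text, rerunning it with a different phrase list does not depend on highlights that were inserted earlier.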