How to break HTML into <div> sections using jsdom

Time:10-10

I have an automatically generated HTML structure like this:

<!DOCTYPE html>
<html>
  <body>
    <h1>My First Heading</h1>
    <p>My first paragraph.</p>
    <h2>Subheading A</h2>
    <p>Subheading content 1</p>
    <ul>
      <li>bla</li>
    </ul>
    <p>Subheading content 2</p>
    <h2>Subheading B</h2>
    <p>Subheading content 1</p>
    <p>Subheading content 2</p>
    <p>Subheading content 3</p>
  </body>
</html>
  • The HTML can have any number of subheadings (h2).
  • There can be any number of HTML elements between subheadings.
  • I want to wrap every subheading and the elements that follow it, up to the next subheading, in a <div>.

Here is an example of the desired output:

<!DOCTYPE html>
<html>
  <body>
    <h1>My First Heading</h1>
    <p>My first paragraph.</p>
    <div>
      <h2>Subheading A</h2>
      <p>Subheading content 1</p>
      <ul>
        <li>bla</li>
      </ul>
      <p>Subheading content 2</p>
    </div>
    <div>
      <h2>Subheading B</h2>
      <p>Subheading content 1</p>
      <p>Subheading content 2</p>
      <p>Subheading content 3</p>
    </div>
  </body>
</html>

Here is the code I've come up with so far:

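const { JSDOM } = require("jsdom");
const dom = new JSDOM(myHtmlString);

// querySelector("h2") selects by tag name; getElementById("h2") would look for id="h2"
const h2 = dom.window.document.querySelector("h2");
const orig_html = h2.innerHTML;
const new_html = "<div>" + orig_html + "</div>";
h2.innerHTML = new_html;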

However, this only wraps the contents of a single h2. To get the result above, I need to find all the elements that belong to each h2 and split the HTML into sections accordingly. Do you have any ideas on the best way to achieve that?

UPDATE:

Is jsdom perhaps not the best tool for this? Would it be better to achieve the same outcome with plain text manipulation?

CodePudding user response:

Determine all <h2> elements (for example, with document.querySelectorAll) and then, for each <h2>, start a new <div> that contains it and all following siblings until meeting another <h2>.

// In jsdom, use dom.window.document and dom.window.XPathResult
var new_html = "";
for (var h2 of document.querySelectorAll("h2")) {
  if (!new_html) // include preamble before first h2
    for (var content, section = document.evaluate("preceding-sibling::*", h2,
         null, XPathResult.ORDERED_NODE_ITERATOR_TYPE, null);
     (content = section.iterateNext()); )
      new_html += content.outerHTML;
  new_html += "<div>" + h2.outerHTML;
  // singleNodeValue is null once there is no following sibling
  for (var elem = h2;
       (elem = document.evaluate("following-sibling::*[1]", elem,
        null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue) &&
       elem.nodeName !== "H2"; ) // nodeName is upper-case in HTML documents
    new_html += elem.outerHTML;
  new_html += "</div>";
}
document.body.innerHTML = new_html;

This solution works properly only if all <h2> elements are on the same level, that is, siblings under the same parent element.

In any case, it is unclear what result you would want for input like this:

<h1>My First Heading</h1>
<h2>My Second Heading</h2>
<div>
  Some text.
  <h2>A Heading At the Wrong Level</h2>
  Some more text.
</div>

CodePudding user response:

One solution could be to give each heading and subheading a custom id, so that when you traverse the DOM you know where each section ends and where to stop. For example, give the subheadings the ids "subheading1" and "subheading2", then collect elements starting at "subheading1" until you reach "subheading2".
