How can I get array of #text nodes of the html tree

Time:05-16

I need to get all the #text nodes of an HTML body as an array. The rich text can be nested to various levels, so I need to reach the lowest-level nodes. For example, for the text below I expect an array of 8 elements.


What is the name, tag, or method used to get the #text nodes?

CodePudding user response:

You can take the outermost parent element and read its innerText property. Note that this gives you one concatenated string, not an array of text nodes.

<script>
// innerText returns the rendered text of the whole body as one string.
var text = document.body.innerText;
// Strip every whitespace character (spaces, tabs, line breaks).
console.log(text.replace(/\s/g, ''));
</script>

Or, if you prefer jQuery, you can do it like this:

console.log($("body span").text().replace(/(\r\n|\n|\r|\t)/gm, ''));

CodePudding user response:

You can recursively scan through the nodes and push the text nodes into an array.

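const textNodes = [];

// Depth-first walk: collect the trimmed value of every non-empty text node.
function pushTextNode(node) {
  // Text nodes report the nodeName "#text" (their nodeType is Node.TEXT_NODE).
  if (node.nodeName === "#text") {
    const nodeVal = node.nodeValue.trim();
    // Skip whitespace-only nodes that come from indentation in the markup.
    if (nodeVal) {
      textNodes.push(nodeVal);
    }
    return;
  }
  node.childNodes.forEach((childNode) => {
    pushTextNode(childNode);
  });
}

pushTextNode(document.querySelector("#root"));
console.log(textNodes); // ["0", "12", "3", "4", "5", "67", "8", "9"]

<div id="root">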
  <span>
    0
    <b>
      12<u>3</u>
    </b>
    <u>
      4<b>5</b>
    </u>
    <b>67</b>8<a href="#">9</a>
  </span>
</div>
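
As a variation on the recursive approach, here is a rough sketch that returns a fresh array instead of pushing into a shared one, and walks the tree with an explicit stack. The helper name collectTextNodes and the hand-built mockRoot tree are made up for illustration. The mock objects only imitate the nodeType, nodeValue and childNodes properties of real DOM nodes, so the function can be tried outside a browser. In a page you would presumably pass document.querySelector("#root") instead.

```javascript
// Numeric values of Node.ELEMENT_NODE and Node.TEXT_NODE in the DOM.
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;

// Hypothetical helper: returns the trimmed, non-empty text of every
// text node under `root`, in document order.
function collectTextNodes(root) {
  const result = [];
  const stack = [root];
  while (stack.length > 0) {
    const node = stack.pop();
    if (node.nodeType === TEXT_NODE) {
      const value = node.nodeValue.trim();
      if (value) result.push(value);
      continue;
    }
    // Push children in reverse so they are popped in document order.
    const children = Array.from(node.childNodes || []);
    for (let i = children.length - 1; i >= 0; i--) {
      stack.push(children[i]);
    }
  }
  return result;
}

// Hand-built stand-ins for DOM nodes (assumption: for illustration only).
const text = (value) => ({ nodeType: TEXT_NODE, nodeValue: value, childNodes: [] });
const el = (...children) => ({ nodeType: ELEMENT_NODE, nodeValue: null, childNodes: children });

// Roughly mirrors <span>0<b>12<u>3</u></b> <u>4<b>5</b></u></span>
const mockRoot = el(
  text("\n  0  "),
  el(text("12"), el(text("3"))),
  text("   "),
  el(text("4"), el(text("5")))
);

console.log(collectTextNodes(mockRoot)); // ["0", "12", "3", "4", "5"]
```

Returning the array keeps the function reusable for several roots on the same page, since nothing accumulates between calls.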
