Javascript getting xlink:href from XPath using document.evaluate

Time:12-12

Hey all, I am having the worst time trying to figure out why my XPath code below cannot find the image tag, and the href link that goes with it, in my document.

The XPath (full) looks like this:

//html/body/div/div/div/div/div/div/div/div/div/div/div/div/div/div/div/div/div/div/a/div/svg/g/descendant::image[starts-with(@href,'https://')]

The JavaScript code I am using is:

function checking(Path) {
   const nodes = document.evaluate(Path, document, null, XPathResult.ANY_TYPE, null);
   const result = {
       Data: []
   };
   let attr = nodes.iterateNext();
   result.Data.push({ href: attr});
   return JSON.stringify(result);
}
console.log(checking("//html/body/div/div/div/div/div/div/div/div/div/div/div/div/div/div/div/div/div/div/a/div/svg/g/descendant::image[starts-with(@href,'https://')]"));

And the HTML that I am searching to get the image's xlink:href:

<body >
  <div id="" style="">
    <div>
      <div >
        <div >
          <div >
            <div >
              <div >
                <div >
                  <div >
                    <div >
                      <div  role="5ma">
                        <div >
                          <div >
                            <div >
                              <div>
                                <div >
                                  <div >
                                    <div >
                                      <a aria-label=""  href="https://www.this.com/link/is/not/needed" tabindex="0">
                                        <div >
                                          <svg aria-label=""  data-visualcompletion="ignore-dynamic" role="img" style="height: 168px; width: 168px;">
                                            <g mask="url(#)">
                                              <image x="0" y="0" height="100%" width="100%" xlink:href="https://www.google.com/logos/doodles/2021/seasonal-holidays-2021-6753651837109324-6752733080595603-cst.gif" style="height: 168px; width: 168px;"></image>
                                              <circle  cx="8" cy="4" r="4"></circle>
                                            </g>
                                          </svg>
                                        </div>
                                      </a>
                                    </div>
                                  </div>
                                </div>
                              </div>
                            </div>
                          </div>
                        </div>
                      </div>
                    </div>
                  </div>
                </div>
              </div>
            </div>
          </div>
        </div>
      </div>
    </div>
  </div>
</body>

I keep getting null for the output for some reason.

Checking the "official" XPath for my project gives this:

//html/body/div[1]/div/div[1]/div/div[3]/div/div/div[1]/div[1]/div/div/div[1]/div[2]/div/div/div/div[1]/div/div/svg/g/image

I changed the latest fiddle to reflect what @bigless suggested in his fiddle, but I still get null.

Newest fiddle

CodePudding user response:

An alternative to @Jack Fleeting's answer (skipping all those divs, just as an example), using an xlink:href selector:

string(//*[name() = 'svg']/*[name()='g']//*[name()='image' and starts-with(@*[name()='xlink:href'],'https://')]/@*[name()='xlink:href'])

This extracts just the attribute value as a string (first occurrence only).
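
For completeness, here is a minimal sketch of how a string-valued expression like the one above might be evaluated. The helper name firstHref and the constant names are made up for illustration. The document is passed in as a parameter, and the numeric value 2 stands in for XPathResult.STRING_TYPE so the function does not rely on any globals.

```javascript
// Same numeric value as XPathResult.STRING_TYPE.
const STRING_TYPE = 2;

// The expression from this answer, split across lines for readability.
const HREF_XPATH =
  "string(//*[name()='svg']/*[name()='g']//*[name()='image' and " +
  "starts-with(@*[name()='xlink:href'],'https://')]/@*[name()='xlink:href'])";

// Hypothetical helper: evaluates an XPath whose outermost call is string(...)
// and returns the resulting string (an empty string when nothing matches).
function firstHref(doc, xpath) {
  const result = doc.evaluate(xpath, doc, null, STRING_TYPE, null);
  return result.stringValue;
}

// Browser usage (skipped where no DOM is available):
if (typeof document !== "undefined") {
  console.log(firstHref(document, HREF_XPATH));
}
```

Because the result type is a string, there is no node to iterate over and no null to guard against; a missing match simply yields an empty string.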

CodePudding user response:

A few things:

First, you have a problem with namespaces and with the deprecated XLink URL reference attribute (xlink:href).

Second, in

result.Data.push({
    href: attr
  });

you should push the node value of the attribute:

result.Data.push({
    href: attr.nodeValue
  });

Finally, because of the namespace issue, and to simplify the XPath expression, change your comeback assignment to

var comeback = checking("//*[local-name()='image'][starts-with(./@href,'https://')]/@href");

And it should work as in this fiddle.
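
As a rough sketch of how the points above could fit together: in an HTML document the parser places inline svg content in the SVG namespace and xlink:href in the XLink namespace, so unprefixed steps such as svg or @href do not match them. One option, different from the local-name() approach in this answer, is to pass a namespace resolver to evaluate. The names nsResolver, checkingAll and NAMESPACES are made up for illustration, and the prefixes svg and xlink are arbitrary; only the namespace URIs are fixed. The function also loops over every match instead of reading just the first, so an empty result gives an empty list rather than an error on null.

```javascript
// Standard namespace URIs; the prefixes used as keys are arbitrary choices.
const NAMESPACES = {
  svg: "http://www.w3.org/2000/svg",
  xlink: "http://www.w3.org/1999/xlink",
};

// Resolver for evaluate(): maps a prefix used in the XPath to its namespace
// URI, or returns null for a prefix it does not know.
function nsResolver(prefix) {
  return Object.prototype.hasOwnProperty.call(NAMESPACES, prefix)
    ? NAMESPACES[prefix]
    : null;
}

// Same numeric value as XPathResult.ORDERED_NODE_ITERATOR_TYPE.
const ORDERED_NODE_ITERATOR_TYPE = 5;

// Variant of checking(): takes the document as a parameter, collects the value
// of every matching attribute node, and yields an empty list on no match.
function checkingAll(doc, path, resolver = null) {
  const nodes = doc.evaluate(path, doc, resolver, ORDERED_NODE_ITERATOR_TYPE, null);
  const result = { Data: [] };
  for (let attr = nodes.iterateNext(); attr !== null; attr = nodes.iterateNext()) {
    result.Data.push({ href: attr.nodeValue });
  }
  return JSON.stringify(result);
}

// Browser usage (skipped where no DOM is available):
if (typeof document !== "undefined") {
  console.log(checkingAll(
    document,
    "//svg:image[starts-with(@xlink:href,'https://')]/@xlink:href",
    nsResolver
  ));
}
```

The same function can be called without a resolver, using the local-name() expression from this answer as the path.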
