Hey all, I am having the worst time trying to figure out why my XPath code below cannot find the image tag, and the href link that goes with it, in my document.
The XPath (full) looks like this:
//html/body/div/div/div/div/div/div/div/div/div/div/div/div/div/div/div/div/div/div/a/div/svg/g/descendant::image[starts-with(@href,'https://')]
The JavaScript code I am using is:
function checking(Path) {
  const nodes = document.evaluate(Path, document, null, XPathResult.ANY_TYPE, null);
  const result = {
    Data: []
  };
  let attr = nodes.iterateNext();
  result.Data.push({ href: attr });
  return JSON.stringify(result);
}
console.log(checking("//html/body/div/div/div/div/div/div/div/div/div/div/div/div/div/div/div/div/div/div/a/div/svg/g/descendant::image[starts-with(@href,'https://')]"));
And this is the HTML I am searching to get the image's xlink:href:
<body >
<div id="" style="">
<div>
<div >
<div >
<div >
<div >
<div >
<div >
<div >
<div >
<div role="5ma">
<div >
<div >
<div >
<div>
<div >
<div >
<div >
<a aria-label="" href="https://www.this.com/link/is/not/needed" tabindex="0">
<div >
<svg aria-label="" data-visualcompletion="ignore-dynamic" role="img" style="height: 168px; width: 168px;">
<g mask="url(#)">
<image x="0" y="0" height="100%" width="100%" xlink:href="https://www.google.com/logos/doodles/2021/seasonal-holidays-2021-6753651837109324-6752733080595603-cst.gif" style="height: 168px; width: 168px;"></image>
<circle cx="8" cy="4" r="4"></circle>
</g>
</svg>
</div>
</a>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</body>
I keep getting null for the output for some reason.
Checking the "official" XPath in my project gives this:
//html/body/div[1]/div/div[1]/div/div[3]/div/div/div[1]/div[1]/div/div/div[1]/div[2]/div/div/div/div[1]/div/div/svg/g/image
I changed the latest fiddle to reflect what @bigless suggested in his fiddle, but I still get null.
CodePudding user response:
An alternative to @Jack Fleeting's answer (skipping all those divs, just as an example) that selects on xlink:href:
string(//*[name() = 'svg']/*[name()='g']//*[name()='image' and starts-with(@*[name()='xlink:href'],'https://')]/@*[name()='xlink:href'])
This extracts just the attribute value as a string (first occurrence only).
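Because the expression is wrapped in string(), the result is a string rather than a node set, so it is read through stringValue instead of iterateNext(). Here is a minimal sketch of how that might be wired up, assuming a browser context. The helper name firstImageHref and its doc parameter are made up for illustration:

```javascript
// XPath from above: string() yields the value of the first matching attribute.
const IMAGE_HREF_XPATH =
  "string(//*[name()='svg']/*[name()='g']//*[name()='image' and " +
  "starts-with(@*[name()='xlink:href'],'https://')]/@*[name()='xlink:href'])";

// Hypothetical helper: `doc` is any object with a DOM-style evaluate(),
// normally the global `document`.
function firstImageHref(doc) {
  const result = doc.evaluate(
    IMAGE_HREF_XPATH,
    doc,
    null,
    XPathResult.STRING_TYPE, // ask for a string result, not a node iterator
    null
  );
  // string() of an empty node set is "", so treat that as "no match".
  return result.stringValue || null;
}

// Only runs where a DOM is available (for example, the browser console).
if (typeof document !== "undefined") {
  console.log(JSON.stringify({ Data: [{ href: firstImageHref(document) }] }));
}
```

Returning null for the empty string is just a convenience here; an empty stringValue only means that nothing matched.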
CodePudding user response:
A few things:
First, you have a problem with namespaces and with deprecated XLink URL reference attributes (xlink:href).
Second, in
result.Data.push({
  href: attr
});
you should push the node value of the attribute:
result.Data.push({
  href: attr.nodeValue
});
Finally, because of the namespace issue, and to simplify the XPath expression, change your comeback variable to
var comeback = checking("//*[local-name()='image'][starts-with(./@href,'https://')]/@href");
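Putting those three points together, a corrected function might look roughly like the sketch below, assuming a browser context. The name collectImageHrefs and the doc parameter are made up for illustration. The attribute test uses local-name()='href', which is my substitution rather than part of the answer above, so that it matches both a plain href and the deprecated xlink:href. It also loops over every match instead of reading only the first one:

```javascript
// Match any <image> element regardless of namespace, then select its
// href attribute whether it is written as href or xlink:href.
const IMAGE_HREF_ATTR_XPATH =
  "//*[local-name()='image']/@*[local-name()='href'][starts-with(.,'https://')]";

// Hypothetical helper: `doc` is normally the global `document`.
function collectImageHrefs(doc, path = IMAGE_HREF_ATTR_XPATH) {
  const nodes = doc.evaluate(
    path,
    doc,
    null,
    XPathResult.ORDERED_NODE_ITERATOR_TYPE,
    null
  );
  const result = { Data: [] };
  // iterateNext() returns null once the node set is exhausted, so an
  // expression that matches nothing simply leaves Data empty.
  for (let attr = nodes.iterateNext(); attr !== null; attr = nodes.iterateNext()) {
    // Push the attribute's value, not the Attr node itself.
    result.Data.push({ href: attr.nodeValue });
  }
  return JSON.stringify(result);
}

// Only runs where a DOM is available (for example, the browser console).
if (typeof document !== "undefined") {
  console.log(collectImageHrefs(document));
}
```

With no match the function returns an empty Data array rather than an entry whose href is null, which makes the "nothing found" case easier to spot.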