I am trying to retrieve a list of elements from a webpage that contain a specific string. I'm evaluating the XPath with Selenium's ISearchContext.FindElement(By.XPath(...)), so the XPath engine is the one supported by the browser.
Below is a minimal sample page that I created; in the real-world scenario I cannot modify the page's structure:
<html lang="en">
<head>
<title></title>
</head>
<body>
<nav>
<ul>
<li>Nav Item 1</li>
<li>Nav Item 2</li>
</ul>
</nav>
<main>
<div class="content">
<p >Lorem ipsum dolor sit amet, consectetur adipisicing elit. Ab deserunt dolore eius est
facere incidunt magni
minus molestiae natus, nesciunt nostrum, officia perspiciatis placeat provident reiciendis saepe sed, sit
voluptates.</p>
<p class="price">1.150<span >,</span><sup>00</sup> <span >EUR</span></p>
</div>
<aside>
<p class="price">1.150<span >,</span><sup>00</sup> <span >EUR</span></p>
</aside>
<footer>
1<span>.</span>150<span >,</span><sup>00</sup> <span >EUR</span>
</footer>
</main>
</body>
</html>
The elements I'm looking for are [p.price, p.price, footer]; basically only the "deepest" elements containing the string 1.150,00 EUR.
If I evaluate the following XPath in Chrome's Developer Tools Console:
$x("//*[contains(., '1.150,00 EUR')][contains(ancestor::*, '1.150,00 EUR')][not(contains(child::*, '1.150,00 EUR'))]")
the result I get is [body, div.content, p.price, p.price, footer].
I'm looking for a solution that still lets me remain in the Selenium context so I can further access element properties (such as element position). It doesn't necessarily have to involve XPath and I am also open to using another browser.
CodePudding user response:
Given that HTML tree, this XPath selects the requested elements:
$x("//*[.='EUR']/parent::*[contains(., '1.150,00 EUR')]")
Result:
Array(3) [ p.price, p.price, footer ]
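Note that this expression relies on the currency code sitting in an element of its own (the span whose string value is exactly 'EUR'). To make that dependency visible, here is a minimal sketch of the same selection logic outside the browser. It assumes Python's standard xml.etree.ElementTree and a trimmed, well-formed copy of the sample page; the names PAGE, string_value and parents_of_exact are hypothetical helpers and are not part of Selenium or of the original answer.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample page (assumption: it parses as well-formed XML).
PAGE = """<main>
<div>
<p>Lorem ipsum dolor sit amet.</p>
<p>1.150<span>,</span><sup>00</sup> <span>EUR</span></p>
</div>
<aside>
<p>1.150<span>,</span><sup>00</sup> <span>EUR</span></p>
</aside>
<footer>
1<span>.</span>150<span>,</span><sup>00</sup> <span>EUR</span>
</footer>
</main>"""


def string_value(element):
    """Concatenate all descendant text, like the XPath string value of '.'."""
    return "".join(element.itertext())


def parents_of_exact(root, label, needle):
    """Parents of elements whose text is exactly `label`, kept if they contain `needle`."""
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    result = []
    for element in root.iter():
        if string_value(element) != label:
            continue
        parent = parent_of.get(element)
        if parent is not None and needle in string_value(parent) and parent not in result:
            result.append(parent)
    return result


found = parents_of_exact(ET.fromstring(PAGE), "EUR", "1.150,00 EUR")
print([element.tag for element in found])
```

If the currency were written as plain text inside the paragraph instead of inside a span, this lookup would find nothing, which is the main trade-off of anchoring on the 'EUR' element.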
CodePudding user response:
//*[contains(., '1.150,00 EUR')]
[not(
*[contains(., '1.150,00 EUR')]
)]
This reads as: "all elements whose string value contains '1.150,00 EUR' and that have no child element whose string value contains '1.150,00 EUR'".
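The difference from the query in the question is where the test is applied. In XPath 1.0, passing a node-set such as child::* to contains() converts only its first node to a string, so not(contains(child::*, ...)) checks just the first child, whereas the predicate *[contains(., ...)] checks every child. The following is a minimal sketch of both behaviours, assuming Python's standard xml.etree.ElementTree and a trimmed copy of the sample page. The helper names are hypothetical, and the class attributes are taken from the result list in the question.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample page (assumption: it parses as well-formed XML).
PAGE = """<html><head><title></title></head><body>
<nav><ul><li>Nav Item 1</li><li>Nav Item 2</li></ul></nav>
<main>
<div class="content">
<p>Lorem ipsum dolor sit amet.</p>
<p class="price">1.150<span>,</span><sup>00</sup> <span>EUR</span></p>
</div>
<aside>
<p class="price">1.150<span>,</span><sup>00</sup> <span>EUR</span></p>
</aside>
<footer>
1<span>.</span>150<span>,</span><sup>00</sup> <span>EUR</span>
</footer>
</main>
</body></html>"""

NEEDLE = "1.150,00 EUR"


def string_value(element):
    """Concatenate all descendant text, like the XPath string value of '.'."""
    return "".join(element.itertext())


def deepest_containing(root, needle):
    """Elements containing `needle` with no child element that also contains it."""
    return [
        element
        for element in root.iter()
        if needle in string_value(element)
        and not any(needle in string_value(child) for child in element)
    ]


def first_child_only(root, needle):
    """Mimic the question's query, which only ever inspects the first child."""
    hits = []
    for element in root.iter():
        if element is root:
            # contains(ancestor::*, ...) is false for the root: it has no ancestors.
            # For every other element of this page the first ancestor in document
            # order is <html>, whose text does contain the needle.
            continue
        children = list(element)
        first_child_text = string_value(children[0]) if children else ""
        if needle in string_value(element) and needle not in first_child_text:
            hits.append(element)
    return hits


root = ET.fromstring(PAGE)
print([e.tag for e in deepest_containing(root, NEEDLE)])  # ['p', 'p', 'footer']
print([e.tag for e in first_child_only(root, NEEDLE)])    # ['body', 'div', 'p', 'p', 'footer']
```

The second list reproduces the unwanted body and div.content entries: their first children (nav and the Lorem ipsum paragraph) do not contain the string, so the first-child test lets them through.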