Selenium Find Only the "Deepest" Elements Containing a Specific String

Time:10-24

I am trying to retrieve a list of elements from a webpage that contain a specific string. I'm evaluating the XPath with Selenium's ISearchContext.FindElement(By.XPath(...)), so the XPath engine is the one provided by the browser.

Below is a minimal sample page that I created; in a real-world scenario I cannot modify the page's structure:

<html lang="en">
<head>
    <title></title>
</head>
<body>
<nav>
    <ul>
        <li>Nav Item 1</li>
        <li>Nav Item 2</li>
    </ul>
</nav>
<main>
    <div class="content">
        <p >Lorem ipsum dolor sit amet, consectetur adipisicing elit. Ab deserunt dolore eius est
            facere incidunt magni
            minus molestiae natus, nesciunt nostrum, officia perspiciatis placeat provident reiciendis saepe sed, sit
            voluptates.</p>
        <p class="price">1.150<span >,</span><sup>00</sup> <span >EUR</span></p>
    </div>
    <aside>
        <p class="price">1.150<span >,</span><sup>00</sup> <span >EUR</span></p>
    </aside>
    <footer>
        1<span>.</span>150<span >,</span><sup>00</sup> <span >EUR</span>
    </footer>
</main>
</body>
</html>

The elements I'm looking for are [p.price, p.price, footer]; basically only the "deepest" elements containing the string 1.150,00 EUR.

If I evaluate the following XPath in Chrome's Developer Tools console:

$x("//*[contains(., '1.150,00 EUR')][contains(ancestor::*, '1.150,00 EUR')][not(contains(child::*, '1.150,00 EUR'))]")

the result I get is [body, div.content, p.price, p.price, footer].

I'm looking for a solution that still lets me remain in the Selenium context so I can further access element properties (such as element position). It doesn't necessarily have to involve XPath and I am also open to using another browser.
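
For what it's worth, the extra body and div.content matches are consistent with how XPath 1.0 converts a node-set to a string: contains(child::*, ...) and contains(ancestor::*, ...) look only at the first node of the set in document order. The sketch below is a plain JavaScript model of the sample page, not a real DOM. The names page, stringValue, firstNodeString and select are made up for illustration, the lorem text is abbreviated, and whitespace between tags is left out. It compares the first-node behaviour with a check over every child element.

```javascript
// Stand-in for the sample page: [label, ...children], where a child is either
// a text string or another element array. Labels such as 'p.price' are only
// names for printing; this is an illustrative model, not a browser DOM.
const price = () => ['p.price', '1.150', ['span', ','], ['sup', '00'], ' ', ['span', 'EUR']];
const page =
  ['html',
    ['head', ['title']],
    ['body',
      ['nav', ['ul', ['li', 'Nav Item 1'], ['li', 'Nav Item 2']]],
      ['main',
        ['div.content', ['p', 'Lorem ipsum ...'], price()],
        ['aside', price()],
        ['footer', '1', ['span', '.'], '150', ['span', ','], ['sup', '00'], ' ', ['span', 'EUR']]]]];

const isElement = (node) => Array.isArray(node);
const childElements = (el) => el.slice(1).filter(isElement);

// XPath string-value of a node: all descendant text joined in document order.
const stringValue = (node) =>
  isElement(node) ? node.slice(1).map(stringValue).join('') : node;

// XPath 1.0 turns a node-set into a string by using only its FIRST node.
const firstNodeString = (nodes) => (nodes.length ? stringValue(nodes[0]) : '');

// Walk the tree in document order and collect labels of elements passing `keep`.
function select(root, keep) {
  const out = [];
  (function walk(el, ancestors) {
    if (keep(el, ancestors)) out.push(el[0]);
    for (const child of childElements(el)) walk(child, [...ancestors, el]);
  })(root, []);
  return out;
}

const S = '1.150,00 EUR';

// Model of the three predicates from the question's XPath.
const attempted = select(page, (el, ancestors) =>
  stringValue(el).includes(S) &&
  firstNodeString(ancestors).includes(S) &&          // first ancestor is html
  !firstNodeString(childElements(el)).includes(S));  // only the first child is tested

// Same idea, but every child element is tested.
const everyChild = select(page, (el) =>
  stringValue(el).includes(S) &&
  !childElements(el).some((child) => stringValue(child).includes(S)));

console.log(attempted);  // [ 'body', 'div.content', 'p.price', 'p.price', 'footer' ]
console.log(everyChild); // [ 'p.price', 'p.price', 'footer' ]
```

In this model body and div.content survive only because their first child element (nav and the lorem paragraph) does not contain the string, which matches the result reported above.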

CodePudding user response:

Given that HTML tree, this XPath selects the requested elements. Note that it relies on every match having a child element whose text is exactly EUR:

$x("//*[.='EUR']/parent::*[contains(., '1.150,00 EUR')]")

Result

Array(3) [ p.price, p.price, footer ]

CodePudding user response:

//*[contains(., '1.150,00 EUR')]
   [not(
      *[contains(., '1.150,00 EUR')]
   )]

This reads as: "all elements whose string value contains '1.150,00 EUR' and that have no child element whose string value contains '1.150,00 EUR'".
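
If the search string is not fixed, the expression above can be generated. The following is a minimal sketch, assuming the text may contain quote characters; xpathLiteral and deepestContaining are hypothetical helper names, not part of Selenium or the browser. XPath 1.0 string literals have no escape character, so text containing both quote types is split and rejoined with concat().

```javascript
// Turn arbitrary text into an XPath 1.0 string literal.
function xpathLiteral(text) {
  if (!text.includes("'")) return `'${text}'`;
  if (!text.includes('"')) return `"${text}"`;
  // Both quote types present: single-quote each piece and glue the pieces
  // back together with a double-quoted apostrophe.
  const parts = text.split("'").map((part) => `'${part}'`);
  return `concat(${parts.join(`, "'", `)})`;
}

// Build the "contains the text, but no child element contains it" expression.
function deepestContaining(text) {
  const literal = xpathLiteral(text);
  return `//*[contains(., ${literal})][not(*[contains(., ${literal})])]`;
}

console.log(deepestContaining('1.150,00 EUR'));
// //*[contains(., '1.150,00 EUR')][not(*[contains(., '1.150,00 EUR')])]
```

The resulting string can be pasted into $x(...) in the console or passed to By.XPath(...) with FindElements, so the matches come back as Selenium elements and their position can still be read.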
