XPath parent::node() not working as expected

Time:06-28

I'm trying to scrape multiple pages that each list several measurements, but the measurements don't appear in the same order on every page, so on each page I have to check which measurement each entry is. I tried to get the parent node of the text SO, NO and CO to identify the element and then put its value in the right place, given the following HTML document:

<ul >
    <li>
       <p >SO₂</p>
       <h1 >0.00</h1>
       <strong>ppb</strong><p>2022/06/13 07:00</p> 
    </li>
    <li>
       <p >NO₂</p>
       <h1 >1.00</h1> 
       <strong>ppb</strong><p>2022/06/26 20:00</p>
   </li>
   <li>
     <p >CO</p>
     <h1 >0.00</h1>
     <strong>ppb</strong>
     <p>2021/07/07 04:00</p>
   </li>
</ul>

I've tried something like this:

elements_name = ['PM10','PM2.5',"PM1","CO","SO","O","NO"]
for element in elements_name:
    driver.find_element_by_xpath(f"//ul[@class='sc-hHftDr gOAyWd']//li//p[contains(., {element})]").find_element_by_xpath("parent::node()").find_element_by_css_selector('h1').text.strip()

The problem is that parent::node() returns the 'SO' element for every entry in elements_name; it never gets the right parent of the node. I also tried ('..') and ('parent::li'), with the same result.

CodePudding user response:

I think the contains() function will return an unexpected result at least sometimes, because e.g. contains('SO₂', 'O') is true, and so is contains('PM10', 'PM1'). I think you should just use the = operator instead of contains().
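To make the substring problem concrete, here is a minimal sketch in plain Python. It does not run XPath; it assumes that `contains()` behaves like a substring test and that `=` behaves like whole-string equality, and mimics both with Python's `in` and `==`. The helper names `first_substring_match` and `first_exact_match` are made up for this illustration, and the labels are taken from the HTML in the question.

```python
# Labels as they appear in the page, in document order.
labels = ["SO₂", "NO₂", "CO"]

def first_substring_match(name, labels):
    """Mimic p[contains(., name)]: return the first label containing name."""
    for label in labels:
        if name in label:
            return label
    return None

def first_exact_match(name, labels):
    """Mimic p = name: return the first label equal to name."""
    for label in labels:
        if label == name:
            return label
    return None

for name in ["CO", "SO", "O", "NO"]:
    print(name, first_substring_match(name, labels), first_exact_match(name, labels))
```

With substring matching, "O" hits "SO₂" because that label comes first in the document, not "CO". With exact matching, only a name that equals the whole label is found.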

You should be able to use a single XPath expression. Something like this:

driver.find_element_by_xpath(
   f"//ul[@class='sc-hHftDr gOAyWd']"
   f"/li[p[@class='sc-bkzZxe card__subtitle']='{element}']"
   f"/h1[@class='sc-idOhPF ipvImd card__highlight-text']"
).text.strip()

This expression will:

  • search the entire document for the ul,
  • select the child li whose subtitle exactly matches (not contains!) the element parameter,
  • return the h1 child of that li.
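The three steps can be tried without a browser. The sketch below is not Selenium: it assumes Python's standard `xml.etree.ElementTree`, whose limited XPath support includes the `[tag='text']` predicate, and it drops the class attributes because the HTML in the question does not show them. The helper `read_measure` is a hypothetical name used only here.

```python
import xml.etree.ElementTree as ET

# The list from the question, trimmed to well-formed markup.
HTML = """<ul>
  <li><p>SO₂</p><h1>0.00</h1><strong>ppb</strong><p>2022/06/13 07:00</p></li>
  <li><p>NO₂</p><h1>1.00</h1><strong>ppb</strong><p>2022/06/26 20:00</p></li>
  <li><p>CO</p><h1>0.00</h1><strong>ppb</strong><p>2021/07/07 04:00</p></li>
</ul>"""

def read_measure(ul, name):
    """Return the h1 text of the li that has a p child equal to name, or None."""
    # li[p='name'] keeps only list items with a p whose full text equals name;
    # /h1 then steps to the h1 child of that same li.
    h1 = ul.find(f"./li[p='{name}']/h1")
    return h1.text.strip() if h1 is not None else None

ul = ET.fromstring(HTML)
for name in ["PM10", "SO₂", "NO₂", "CO", "O"]:
    print(name, read_measure(ul, name))
```

Because the match is exact, the names in the lookup list have to equal the page labels in full: "SO₂" and "NO₂" are found, while "SO", "NO" and "O" are not. A name with no matching li returns None here; in Selenium the equivalent lookup would raise instead, so a missing measure needs to be handled per page.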