Can't find element by XPath on certain website

Time:11-17

My goal is to scrape the definitions of words in Python.

To begin with, I am trying to get just the first definition of the word "assist" which should be "to help". I am using dictionary.cambridge.org

# the web driver goes to the page
driver.get("https://dictionary.cambridge.org/dictionary/english/assist")

# give the page time to load
time.sleep(4)

# click "accept cookies"
driver.find_element_by_xpath("/html[@class='i-amphtml-singledoc i-amphtml-standalone']/body[@class='break default_layout amp-mode-mouse']/div[@id='onetrust-consent-sdk']/div[@id='onetrust-banner-sdk']/div[@class='ot-sdk-container']/div[@class='ot-sdk-row']/div[@id='onetrust-button-group-parent']/div[@id='onetrust-button-group']/div[@class='banner-actions-container']/button[@id='onetrust-accept-btn-handler']").click()

Up to this point, everything works correctly. However, when I try to print the first definition using "find element by xpath", I get a NoSuchElementException. I'm pretty familiar with Selenium and have scraped web pages hundreds of times before, but on this page I don't know what I'm doing wrong. Here's the code I am using:

print(driver.find_element_by_xpath("/html[@class='i-amphtml-singledoc i-amphtml-standalone']/body[@class='break default_layout amp-mode-mouse']/div[@class='cc fon']/div[@class='pr cc_pgwn']/div[@class='x lpl-10 lpr-10 lpt-10 lpb-25 lmax lp-m_l-20 lp-m_r-20']/div[@class='hfr-m ltab lp-m_l-15']/article[@id='page-content']/div[@class='page']/div[@class='pr dictionary'][1]/div[@class='link']/div[@class='pr di superentry']/div[@class='di-body']/div[@class='entry']/div[@class='entry-body']/div[@class='pr entry-body__el'][1]/div[@class='pos-body']/div[@class='pr dsense dsense-noh']/div[@class='sense-body dsense_b']/div[@class='def-block ddef_block ']/div[@class='ddef_h']/div[@class='def ddef_d db']").text)

CodePudding user response:

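Instead of an absolute XPath, use a relative XPath. An absolute path that pins the exact class string of every ancestor stops matching as soon as any one of those attributes differs at runtime. You can refer to this link.

I tried the code below and it retrieved the data.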

driver.get("https://dictionary.cambridge.org/dictionary/english/assist")

print(driver.find_element_by_xpath("(//div[@class='ddef_h'])[1]/div").get_attribute("innerText"))
to help:
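
The difference between the two kinds of path can be reproduced without a browser. The following is only a sketch: it uses the standard library's xml.etree.ElementTree (which supports a small XPath subset) instead of Selenium, and PAGE is a made-up, heavily simplified stand-in for the real markup. It assumes, as one plausible cause, that the body's class string at runtime is not exactly the one copied into the absolute path.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real page is far larger.
PAGE = """
<html class="i-amphtml-singledoc">
  <body class="break default_layout">
    <div class="page">
      <div class="ddef_h"><div class="def ddef_d db">to help:</div></div>
      <div class="ddef_h"><div class="def ddef_d db">a second sense</div></div>
    </div>
  </body>
</html>
"""

# Absolute-style path: every step pins an exact class string, so one
# differing ancestor attribute is enough to make the whole path fail.
ABSOLUTE = (
    "./body[@class='break default_layout amp-mode-mouse']"
    "/div[@class='page']/div[@class='ddef_h']/div"
)

# Relative path: only the element of interest is constrained.
RELATIVE = ".//div[@class='ddef_h']/div"


def first_definition(root, path):
    """Return the text of the first element matching path, or None."""
    element = root.find(path)
    return None if element is None else element.text


root = ET.fromstring(PAGE)  # root is the <html> element
print(first_definition(root, ABSOLUTE))
print(first_definition(root, RELATIVE))
```

Here the absolute path finds nothing because the body class in PAGE lacks amp-mode-mouse, while the relative path still reaches the first definition.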

CodePudding user response:

To print the scraped definition of a word, you can use either of the following locator strategies:

  • Using xpath and text attribute:

    print(driver.find_element_by_xpath("//span[contains(@class, 'epp-xref dxref')]//following::div[1]").text)
    
  • Using xpath and innerText:

    print(driver.find_element_by_xpath("//span[contains(@class, 'epp-xref dxref')]//following::div[1]").get_attribute("innerText"))
    
  • Console Output:

    to help:
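
The two strategies above can be folded into one small helper. This is a sketch, not part of the original answer: element_text is a hypothetical name, and the fallback rests on the assumption that .text can come back empty (for example, for an element that is not displayed) while innerText still holds the content. It calls find_element(by, value), since newer Selenium 4 releases drop the find_element_by_* helpers; the string "xpath" is the value of selenium's By.XPATH constant.

```python
# Locator taken from the answer above.
DEFINITION_XPATH = "//span[contains(@class, 'epp-xref dxref')]//following::div[1]"


def element_text(driver, xpath):
    """Return an element's text, preferring .text and falling back to innerText.

    `driver` is any object exposing Selenium's find_element(by, value).
    """
    # "xpath" equals selenium.webdriver.common.by.By.XPATH.
    element = driver.find_element("xpath", xpath)
    text = element.text
    if not text:
        # .text was empty, so read the innerText property instead.
        text = element.get_attribute("innerText") or ""
    return text.strip()
```

With a live driver that has already loaded the page, print(element_text(driver, DEFINITION_XPATH)) would replace either of the two print calls above.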
    