Selenium cannot extract text

Time:04-15

I am trying to extract some text from this page


In particular, I want to extract the text between the <pre> tags. I am using Selenium, and although the element is found, its text comes back as an empty string. This is the code I am using:

testo = driver.find_element_by_xpath('/html/body/span/pre[1]').text

What do you think the issue could be?

CodePudding user response:

The text within the <pre> tag is inside an <iframe>.

So to extract the desired text you have to:

  • Induce WebDriverWait for the desired frame to be available and switch to it.

  • Induce WebDriverWait for the desired element to be visible.

  • You can use either of the following Locator Strategies:

    • Using CSS_SELECTOR:

      WebDriverWait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR,"iframe#mainFrame")))
      print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "span.dettaglio_atto_testo"))).get_attribute("innerHTML"))
      
    • Using XPATH:

      WebDriverWait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.XPATH,"//iframe[@id='mainFrame']")))
      print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//span[@class='dettaglio_atto_testo']/pre"))).text)
      
  • Note: You have to add the following imports:

     from selenium.webdriver.support.ui import WebDriverWait
     from selenium.webdriver.common.by import By
     from selenium.webdriver.support import expected_conditions as EC
    

CodePudding user response:

First, switch to the iframe. Then you can use the .getText() method (the .text property in the Python bindings). If that doesn't work, try .getAttribute("innerText") (get_attribute("innerText") in Python).
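
In Python, that fallback could be sketched roughly as follows. `element_text` is a hypothetical helper name, and `element` is assumed to be a WebElement that was already located after switching into the iframe.

```python
def element_text(element):
    """Return the element's visible text, falling back to innerText (hypothetical helper)."""
    # .text returns only the rendered (visible) text and may be an empty string.
    text = element.text
    if text:
        return text
    # Fall back to the innerText DOM property; get_attribute may return None.
    return element.get_attribute("innerText") or ""
```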
