How can I find elements that are not in the page source using Selenium (Python)?

Time:09-28

I'm currently trying to scrape a website. For that I need the content of an email, so I use yopmail (https://yopmail.com). In yopmail the mails are listed on the left side of the screen, with the mail subject under each one. This text is the part I need. [the mail view][1] [the devtools code][2]

The problem is that this code is not in the page source. From what I read online, this can be caused by JavaScript generating the content, although I'm not sure that is exactly the problem.

I've tried multiple solutions:

attempt 1: using BeautifulSoup to locate the element (failed because it is not in the page source)

attempt 2: locating the element by XPath with the Selenium driver (also unable to find it)

attempt 3: getting the inner HTML of the body (the element is still not in that HTML)

driver.find_element_by_tag_name('body').get_attribute('innerHTML')

It feels like nothing works, and the other related posts here don't give me an answer that helps. Is there anyone who can help me with this?

[1]: https://i.stack.imgur.com/vTi0s.png
[2]: https://i.stack.imgur.com/nmBZ8.png

CodePudding user response:

It seems the element you are trying to get is inside an iframe, which is why you cannot locate it. Selenium searches only the document it is currently focused on, and an iframe is a separate document. So you first have to switch to the iframe by using:

WebDriverWait(driver, 10).until(EC.frame_to_be_available_and_switch_to_it((By.ID, 'ifinbox')))

element = driver.find_element(By.XPATH, "//div[@class='lms']")
print(element.text)

When you are done, you can switch back to the default content by using:

driver.switch_to.default_content()

NOTE: You need to import the following:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
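
If you read from the iframe in several places, it is easy to forget the switch back to the default content, especially when the lookup inside the frame raises. One way to guard against that is a small context manager. This is only a sketch: the names inside_frame and read_text_in_frame are made up for illustration, and the locators are passed as plain (by, value) pairs, where "id" and "xpath" are the string values behind By.ID and By.XPATH. It also skips the explicit wait shown above, so it assumes the iframe has already loaded.

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame_locator):
    """Switch into the iframe matched by frame_locator, then always switch back.

    frame_locator is a (by, value) pair such as ("id", "ifinbox").
    Hypothetical helper, not part of Selenium itself.
    """
    frame = driver.find_element(*frame_locator)
    driver.switch_to.frame(frame)
    try:
        yield driver
    finally:
        # Runs even if the lookup inside the frame raises,
        # so later lookups start from the top-level document again.
        driver.switch_to.default_content()


def read_text_in_frame(driver, frame_locator, element_locator):
    """Return the text of one element that lives inside an iframe."""
    with inside_frame(driver, frame_locator) as framed:
        return framed.find_element(*element_locator).text
```

With the ids from this answer, the call would look like read_text_in_frame(driver, ("id", "ifinbox"), ("xpath", "//div[@class='lms']")). If the frame loads late, keep the WebDriverWait line from above for the switch instead.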