Selenium will not load full Page Source, only partially through CSS styles and then cuts off

Time:11-20

I have tried looking through several answers on Stack Overflow, all to no avail. When I print the page source of the webpage, I can only see the source up to a certain point within a tag, give or take a few characters. The HTML elements beyond are never loaded or printed out in the page source. When I attempt to load HTML elements that should be present (they're there when I view page source on Chrome), I get either a TimeoutException or a NoSuchElementException.

I'm parsing a dynamically loaded website after passing through a multi-factor auth (MFA) portal. I've printed driver.current_url to confirm that I am on the correct URL after MFA. I have also tried sleep(100), as well as explicit waits on EC.url_contains(...), EC.element_to_be_clickable(...), and EC.presence_of_element_located(...).

Here is my code:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

url = "https://brightspace.nyu.edu/d2l/home"

driver = webdriver.Chrome()     # should open a Chrome window
driver.get(url)         # navigate to brightspace

# MFA Handling Code here #

# Explicitly wait until we reach the Brightspace home page (logged in)   
element = WebDriverWait(driver,100).until(EC.url_contains('https://brightspace.nyu.edu/d2l/home'))
print(driver.page_source)
banner = driver.find_element_by_id('bannerTitle')   # throws NoSuchElementException

This is part of the output:

        <!-- ... previous styles and HTML in <head> ... -->
        <style is="custom-style">html {
                        --d2l-color-woolonardo: var(--d2l-color-sylvite);
                        .
                        .   lots of colors
                        .
                        --d2l-color-olivine-light-1: var(--d2l-color-olivine-plus-1);
                        --d2l
                        <!-- ^^ the page source cuts off here, in <head> -->

with the last line giving the following error:

selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"[id="bannerTitle"]"}

CodePudding user response:

Instead of calling driver.find_element_by_id directly, I would suggest looking up the banner with WebDriverWait, By, and EC, so that the lookup waits for the element to appear. I would also move print(driver.page_source) to after the banner is found. Scrolling down the page may help as well. Below I commented out some of your lines and added my suggested updates.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

url = "https://brightspace.nyu.edu/d2l/home"

driver = webdriver.Chrome()     # should open a Chrome window
driver.get(url)         # navigate to brightspace

# MFA Handling Code here #

# Explicitly wait until we reach the Brightspace home page (logged in)   
element = WebDriverWait(driver,100).until(EC.url_contains('https://brightspace.nyu.edu/d2l/home'))
# print(driver.page_source)
# banner = driver.find_element_by_id('bannerTitle')   # throws NoSuchElementException
##################################
######## NEW SUGGESTIONS #########
##################################
# Use the module-level driver created above (there is no self in this script)
banner = WebDriverWait(driver, 100).until(EC.visibility_of_element_located(
        (By.ID, "bannerTitle")))
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
print(driver.page_source)
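
To make it clearer what the explicit wait above is doing, here is a rough sketch of the idea in plain Python: call a condition repeatedly until it returns something truthy or a deadline passes. This is not Selenium's implementation. The names wait_until, WaitTimeout, and FakePage are hypothetical and exist only for this illustration, and LookupError stands in for "element not found yet".

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy (hypothetical name)."""


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
        except LookupError:
            # Stand-in for "element not found yet"; treated as "keep waiting".
            result = None
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


class FakePage:
    """Hypothetical stand-in for a page whose banner appears on the third lookup."""

    def __init__(self):
        self.lookups = 0

    def find_banner(self):
        self.lookups += 1
        if self.lookups < 3:
            raise LookupError("bannerTitle not in the DOM yet")
        return "bannerTitle"


page = FakePage()
banner = wait_until(page.find_banner, timeout=2.0, poll_interval=0.01)
print(banner, page.lookups)
```

WebDriverWait behaves along the same lines: by default it polls every 0.5 seconds, ignores NoSuchElementException while polling, and raises TimeoutException once the timeout expires. So a TimeoutException from the suggested code would mean the element never showed up in the DOM that Selenium is looking at, not that the wait gave up too early.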