This page loads dynamically, with different elements loading at different times.
I currently wait on an element that I've noticed takes a little longer than the others:
WebDriverWait(driver, 30).until(
    EC.element_to_be_clickable(
        (By.XPATH, "//div[contains(@class,'tooltip-vitrine')]")
    )
)
But I would like to know if there is any way to track the order in which elements are loaded on a page, so that I can find a pattern and wait on an element that always takes longer than the others, giving greater confidence that the page has loaded completely.
CodePudding user response:
Try this solution out: check document.readyState and wait until complete is returned.
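A minimal sketch of that idea, assuming `driver` is any object with Selenium's `execute_script` method. The helper name `wait_for_ready_state` and its `timeout` and `poll` parameters are made up for illustration; they are not part of Selenium.

```python
import time


def wait_for_ready_state(driver, timeout=30.0, poll=0.5):
    """Poll document.readyState until it is 'complete' or the timeout expires.

    Returns True if 'complete' was observed, False if the timeout was reached.
    `driver` is assumed to expose execute_script(), as Selenium WebDrivers do.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = driver.execute_script("return document.readyState")
        if state == "complete":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

Usage would be something like `wait_for_ready_state(driver, timeout=30)` right after `driver.get(...)`. Keep in mind that readyState only covers the initial document load, so elements injected later by scripts are not accounted for.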
CodePudding user response:
I don't know if there is a way to check whether the page is fully loaded, so I don't know if there is a solid method to find the last loaded element.
A naive method is to check the number of elements in the page as it loads: the number should increase and then stop changing once the page is fully loaded.
(Notice that the part about document.readyState was added to check whether the answer by Roman J could work, but it doesn't seem to work, since it prints complete even when new elements are loaded afterwards.)
import time

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()  # assumption: any WebDriver instance works here
driver.get('https://globo.com')
lists_of_elements = [[]]
time_old = time.time()
# maximum time in seconds to wait for the element count to change
max_wait = 50

while True:
    # find all elements in the page
    elements = driver.find_elements(By.XPATH, '//*')
    time_new = time.time()
    # compare the number of elements between the new list and the previous list
    if len(elements) != len(lists_of_elements[-1]):
        print(f'loaded elements: {len(elements)} - doc state: {driver.execute_script("return document.readyState")}')
        lists_of_elements.append(elements)
        time_old = time_new
    # stop once the count has not changed for max_wait seconds
    if time_new - time_old > max_wait:
        print('page seems to be fully loaded')
        break
Output
loaded elements: 3053 - doc state: complete
loaded elements: 3054 - doc state: complete
loaded elements: 3153 - doc state: complete
loaded elements: 3152 - doc state: complete
loaded elements: 3156 - doc state: complete
loaded elements: 3160 - doc state: complete
page seems to be fully loaded
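The same idea can be wrapped in a reusable helper. This is only a sketch under a few assumptions: the name `wait_for_stable_dom` and the parameters `quiet_period`, `timeout` and `poll` are made up for illustration, and the element count is taken in the browser with `document.getElementsByTagName('*').length` rather than with `find_elements`, which should be cheaper but is a swapped-in technique, not what the loop above does.

```python
import time


def wait_for_stable_dom(driver, quiet_period=5.0, timeout=60.0, poll=0.5):
    """Wait until the number of DOM elements stops changing.

    Returns the element count once it has stayed the same for `quiet_period`
    seconds, or None if `timeout` seconds pass first.
    `driver` is assumed to expose execute_script(), as Selenium WebDrivers do.
    """
    count_script = "return document.getElementsByTagName('*').length"
    start = time.monotonic()
    last_count = driver.execute_script(count_script)
    last_change = start
    while True:
        time.sleep(poll)
        now = time.monotonic()
        count = driver.execute_script(count_script)
        if count != last_count:
            # the DOM changed, so restart the quiet period
            last_count, last_change = count, now
        elif now - last_change >= quiet_period:
            return last_count
        if now - start >= timeout:
            return None
```

A short `quiet_period` makes the wait faster but more likely to stop before late content such as ads has arrived, so the value is a trade-off to tune per site.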
Then, to see which elements were loaded last (i.e. their HTML code), just run the following:
# compute the difference between the last two lists
last_loaded_elements = list(set(lists_of_elements[-1]) - set(lists_of_elements[-2]))
for idx, el in enumerate(last_loaded_elements):
    print(f"element {idx}\n{el.get_attribute('outerHTML')}\n")
Output
element 0
<link rel="preload" href="https://adservice.google.it/adsid/integrator.js?domain=www.globo.com" as="script">
element 1
<script type="text/javascript" src="https://adservice.google.com/adsid/integrator.js?domain=www.globo.com"></script>
element 2
<iframe frameborder="0" src="https://a3b68e638f6dccabe7e288ddc2ab6c43.safeframe.googlesyndication.com/safeframe/1-0-40/html/container.html" id="google_ads_iframe_/95377733/tvg_Globo.com.Home_0" title="3rd party ad content" name="" scrolling="no" marginwidth="0" marginheight="0" width="970" height="250" data-is-safeframe="true" sandbox="allow-forms allow-popups allow-popups-to-escape-sandbox allow-same-origin allow-scripts allow-top-navigation-by-user-activation" role="region" aria-label="Advertisement" tabindex="0" data-google-container-id="1" style="border: 0px; vertical-align: bottom;" data-load-complete="true"></iframe>
element 3
<script type="text/javascript" src="https://adservice.google.it/adsid/integrator.js?domain=www.globo.com"></script>
element 4
<link rel="preload" href="https://adservice.google.com/adsid/integrator.js?domain=www.globo.com" as="script">
element 5
<div id="google_ads_iframe_/95377733/tvg_Globo.com.Home_0__container__" style="border: 0pt none; margin: auto; text-align: center; width: 970px; height: 250px;"><iframe frameborder="0" src="https://a3b68e638f6dccabe7e288ddc2ab6c43.safeframe.googlesyndication.com/safeframe/1-0-40/html/container.html" id="google_ads_iframe_/95377733/tvg_Globo.com.Home_0" title="3rd party ad content" name="" scrolling="no" marginwidth="0" marginheight="0" width="970" height="250" data-is-safeframe="true" sandbox="allow-forms allow-popups allow-popups-to-escape-sandbox allow-same-origin allow-scripts allow-top-navigation-by-user-activation" role="region" aria-label="Advertisement" tabindex="0" data-google-container-id="1" style="border: 0px; vertical-align: bottom;" data-load-complete="true"></iframe></div>
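To look for a pattern in the loading order, as the question asks, one option is to record when each element is first seen and then rank elements by that time. The sketch below is plain Python and makes some assumptions: the function names `first_seen_times` and `slowest_keys` are made up, each snapshot is a `(timestamp, keys)` pair in chronological order, and each element is identified by a string key of your choice (for example its id attribute).

```python
def first_seen_times(snapshots):
    """Map each key to the timestamp of the first snapshot that contained it.

    `snapshots` is an iterable of (timestamp, keys) pairs in chronological order.
    """
    first_seen = {}
    for timestamp, keys in snapshots:
        for key in keys:
            # setdefault keeps the earliest timestamp already stored for a key
            first_seen.setdefault(key, timestamp)
    return first_seen


def slowest_keys(first_seen, top=3):
    """Return up to `top` keys, ordered from latest to earliest first appearance."""
    return sorted(first_seen, key=first_seen.get, reverse=True)[:top]


# Hypothetical data: three polls of the page, each with the ids found so far.
snapshots = [
    (0.0, {"header"}),
    (1.5, {"header", "menu"}),
    (4.0, {"header", "menu", "ad"}),
]
print(slowest_keys(first_seen_times(snapshots), top=2))
```

With Selenium, the keys for one snapshot could be collected with something like `{el.get_attribute("id") for el in driver.find_elements(By.XPATH, "//*[@id]")}`. Repeating this over several page loads and comparing the results would show whether the same element is consistently last, although on this page the last elements appear to be ads, which may not load the same way every time.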