Python & Selenium: How to get values generated by JavaScript

Time:04-28

I use Selenium with Python for scraping. Some values are displayed in the browser, but I can't retrieve them.

When I checked the HTML source code, I found that the values are not in it, as shown below.

HTML

<div id="pos-list-body" >

</div>

But the values are there when I inspect the same element with Chrome DevTools.

DevTools

<div id="pos-list-body" >
    <div  id="pos-row-1">
        <div >
            <input  type="checkbox" value="1">
        </div>
        <div >
            1
        </div>
        <div >
            a
        </div>
        ...
    </div>
    <div  id="pos-row-2">
        <div >
            <input  type="checkbox" value="2">
        </div>
        <div >
            2
        </div>
        <div >
            b
        </div>
        ...
    </div>
    ...
</div>

It seems that these values are generated by JavaScript or something similar.

There is no iframe in the source code.

How can I get these values with Python?

I would appreciate any hints.

CodePudding user response:

If the ID pos-list-body is unique in the HTML DOM, your best bet is to use an explicit wait and then read the element's innerText.

Code:

wait = WebDriverWait(driver, 20)
print(wait.until(EC.presence_of_element_located((By.ID, "pos-list-body"))).get_attribute('innerText'))

Imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
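
One caveat: the question shows that the empty pos-list-body container is already in the static HTML, so a presence check on the container may succeed before JavaScript has added any rows. A minimal sketch of waiting for the rows themselves follows. It assumes the rows are direct children with ids starting with "pos-row-", as in the DevTools output. The names ROW_SELECTOR, rows_present and read_rows are hypothetical helpers, not part of Selenium.

```python
# Assumed from the DevTools output in the question: rows are direct children
# of #pos-list-body and their ids start with "pos-row-".
ROW_SELECTOR = "#pos-list-body > div[id^='pos-row-']"


def rows_present(driver):
    """Custom wait condition: return the row elements once any exist, else False."""
    # "css selector" is the string value of By.CSS_SELECTOR.
    rows = driver.find_elements("css selector", ROW_SELECTOR)
    return rows if rows else False


def read_rows(driver, timeout=20):
    """Wait until at least one row has been rendered, then return each row's visible text."""
    # Selenium is only needed when a real browser is being driven.
    from selenium.webdriver.support.ui import WebDriverWait

    # until() polls the callable with the driver until it returns a truthy value
    # and raises TimeoutException if that never happens within `timeout` seconds.
    rows = WebDriverWait(driver, timeout).until(rows_present)
    return [row.text for row in rows]
```

Calling read_rows(driver) after driver.get(...) should then return one string per rendered row, provided the selector assumption holds for the real page.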

CodePudding user response:

Element.outerHTML

The outerHTML property of an Element returns a string (DOMString) containing the serialized HTML of the element and its descendants. Setting outerHTML replaces the element and all of its descendants with a new DOM tree constructed by parsing the given string. To obtain only the HTML of an element's contents, without the element's own tag, use the innerHTML property instead.


Solution

To get the HTML generated by JavaScript, you can run the following once the page has rendered:

print(driver.execute_script("return document.getElementById('pos-list-body').outerHTML"))
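
The string returned by outerHTML still has to be turned into values. A minimal sketch using only the standard library is shown below. It assumes the row structure from the DevTools output in the question (a div with an id starting with "pos-row-" wrapping one div per cell). RowParser and parse_rows are hypothetical names introduced for illustration.

```python
from html.parser import HTMLParser


class RowParser(HTMLParser):
    """Collect the non-empty text of each <div id="pos-row-N"> block."""

    def __init__(self):
        super().__init__()
        self.rows = {}        # row id -> list of cell texts
        self._current = None  # id of the row being read, if any
        self._depth = 0       # <div> nesting depth inside the current row

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return  # <input> and other tags do not affect div nesting
        row_id = dict(attrs).get("id") or ""
        if self._current is None and row_id.startswith("pos-row-"):
            self._current = row_id
            self.rows[row_id] = []
            self._depth = 1
        elif self._current is not None:
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "div" and self._current is not None:
            self._depth -= 1
            if self._depth == 0:
                self._current = None  # the row's own </div> was reached

    def handle_data(self, data):
        text = data.strip()
        if self._current is not None and text:
            self.rows[self._current].append(text)


def parse_rows(outer_html):
    """Return {row id: [cell text, ...]} for the given HTML string."""
    parser = RowParser()
    parser.feed(outer_html)
    parser.close()
    return parser.rows


# Usage with the call above (requires a live driver):
# rows = parse_rows(driver.execute_script(
#     "return document.getElementById('pos-list-body').outerHTML"))
```

The checkbox values are not collected here because they are attributes, not text. Reading them would need an extra branch for input tags in handle_starttag.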