Selenium & BeautifulSoup cannot find Fields table elements within #shadow_root

Time:08-12

I have been trying to scrape the Fields table from a page, but neither Selenium nor BeautifulSoup can find the table elements inside its #shadow-root.

CodePudding user response:

@CalGrace The page contains shadow roots. Read up on Shadow DOM in Selenium for more details.

The following code should work in Chrome:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager

chrome_options = Options()
# webdriver_manager downloads a chromedriver that matches the installed Chrome.
service = Service(ChromeDriverManager().install())
driver = webdriver.Chrome(options=chrome_options, service=service)

driver.get(
    "https://developer.salesforce.com/docs/atlas.en-us.netzero_cloud_dev_guide.meta/netzero_cloud_dev_guide/sforce_api_objects_airtravelemssnfctr.htm#maincontent")
# Outer shadow host: <doc-xml-content> inside <main id="maincontent">.
main_content = driver.find_element(By.CSS_SELECTOR, "main[id='maincontent'] doc-xml-content")
main_content_shadow_root = main_content.shadow_root
# Inner shadow host: <doc-content>, found inside the first shadow root.
doc_content = main_content_shadow_root.find_element(By.CSS_SELECTOR, "doc-content")
doc_content_shadow_root = doc_content.shadow_root
# The Fields table lives inside the second shadow root.
table = doc_content_shadow_root.find_element(By.CSS_SELECTOR, ".featureTable.sort_table")
print(table.tag_name)
driver.quit()
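
If you need to do this on several pages, the host-by-host walk can be folded into a small helper. This is only a sketch: `find_in_nested_shadow` is a name I made up, and it assumes every host in the chain has an open shadow root and is already present when you look it up. The string "css selector" is the value of Selenium's `By.CSS_SELECTOR`, written out so the helper has no imports of its own.

```python
CSS = "css selector"  # same string as selenium's By.CSS_SELECTOR


def find_in_nested_shadow(context, host_selectors, target_selector):
    """Walk through a chain of shadow hosts, then locate the target element.

    context: a WebDriver (or anything else with a find_element method).
    host_selectors: CSS selectors of the shadow hosts, outermost first.
    target_selector: CSS selector evaluated inside the innermost shadow root.
    """
    for selector in host_selectors:
        host = context.find_element(CSS, selector)
        context = host.shadow_root  # step into this host's shadow tree
    return context.find_element(CSS, target_selector)
```

With the driver from the code above, the call would look like `find_in_nested_shadow(driver, ["doc-xml-content", "doc-content"], "table.featureTable")`.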

CodePudding user response:

shadow_root

The shadow_root attribute returns the element's shadow root if it has one and raises an error otherwise. It only works from Chromium 96 onwards; earlier versions of Chromium-based browsers throw an assertion exception.
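
If you would rather get None than an exception for elements without a shadow root, a thin wrapper can do that. This is a sketch, not something from the Selenium docs: `shadow_root_or_none` is a made-up name, and it assumes the error raised is `NoSuchShadowRootException` from `selenium.common.exceptions`, so check that against the Selenium version you have installed.

```python
try:
    from selenium.common.exceptions import NoSuchShadowRootException
except ImportError:
    # Stand-in so the sketch still loads where Selenium is not installed.
    class NoSuchShadowRootException(Exception):
        pass


def shadow_root_or_none(element):
    """Return element.shadow_root, or None when the element has no shadow root."""
    try:
        return element.shadow_root
    except NoSuchShadowRootException:
        return None
```

Other errors (a stale element, an old browser that does not support the command) are deliberately not caught, so they still surface.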


Solution

With Chromium v96 or later you can reach the Fields table this way. The table element sits inside multiple nested #shadow-root (open) nodes, so step into each shadow root in turn with the following locator strategy:

  • Code Block:

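    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By
    from webdriver_manager.chrome import ChromeDriverManager

    options = Options()
    s = Service(ChromeDriverManager().install())  # driver setup assumed, as in the answer above
    driver = webdriver.Chrome(service=s, options=options)
    driver.get('https://developer.salesforce.com/docs/atlas.en-us.netzero_cloud_dev_guide.meta/netzero_cloud_dev_guide/sforce_api_objects_airtravelemssnfctr.htm#maincontent')
    # doc-xml-content hosts the outer shadow root, doc-content the inner one.
    shadow_host = driver.find_element(By.CSS_SELECTOR, 'doc-xml-content')
    shadow_root = shadow_host.shadow_root
    shadow_child = shadow_root.find_element(By.CSS_SELECTOR, 'doc-content')
    shadow_grand_child = shadow_child.shadow_root
    element = shadow_grand_child.find_element(By.CSS_SELECTOR, 'table.featureTable')
    print(element.get_attribute("outerHTML"))
    driver.quit()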
    
  • Console Output:

    <table  summary="">
    
    
                   <thead  align="left">
                      <tr>
                         <th  id="d51659e96">Field</th>
    
                         <th  id="d51659e99">Details</th>
    
                      </tr>
    
                   </thead>
    
                   <tbody >
                      <tr>
                         <td  headers="d51659e96" data-title="Field"><span >Ch4PsgrKmLongHaulInKgCo2e</span></td>
    
                         <td  headers="d51659e99" data-title="Details">
                            <dl >
    
                                  <dt >Type</dt>
    
                                  <dd >double</dd>
    
    
    
                                  <dt >Properties</dt>
    
                                  <dd >Create, Filter, Nillable, Sort, Update</dd>
    
    
    
                                  <dt >Description</dt>
    
                                  <dd > The CH4 emissions per passenger-kilometer in CO2e from long-haul
                                     flights. </dd>
    
    
                            </dl>
    
                         </td>
    
                      </tr>
    
                      <tr>
                         <td  headers="d51659e96" data-title="Field"><span >Ch4PsgrKmMediumHaulInKgCo2e</span></td>
    
                         <td  headers="d51659e99" data-title="Details">
                            <dl >
    
                                  <dt >Type</dt>
    
                                  <dd >double</dd>
    
    
    
                                  <dt >Properties</dt>
    
                                  <dd >Create, Filter, Nillable, Sort, Update</dd>
    
    
    
                                  <dt >Description</dt>
    
                                  <dd > The CH4 emissions per passenger-kilometer in CO2e from
                                     medium-haul flights. </dd>
    
    
                            </dl>
    
                         </td>
    
                      </tr>
                      <tr>
                         <td  headers="d51659e96" data-title="Field"><span >DistanceUnit</span></td>
    
                         <td  headers="d51659e99" data-title="Details">
                            <dl >
    
                                  <dt >Type</dt>
    
                                  <dd >picklist</dd>
    
    
    
                                  <dt >Properties</dt>
    
                                  <dd >Create, Defaulted on create, Filter, Group, Nillable, Restricted
                                     picklist, Sort, Update</dd>
    
    
    
                                  <dt >Description</dt>
    
                                  <dd > The unit of measure for the distance. </dd>
    
                                  <dd >Possible values are: <ul >
                                        <li ><samp >Kilometers</samp></li>
    
                                        <li ><samp >Miles</samp></li>
    
                                     </ul>
    
                                  </dd>
    
                                  <dd >The default value is 'Kilometers'.</dd>
    
    
                            </dl>
    
                         </td>
    
                      </tr>
    
                      <tr>
                         <td  headers="d51659e96" data-title="Field"><span >EmissionFactorDataSource</span></td>
    
                         <td  headers="d51659e99" data-title="Details">
                            <dl >
    
                                  <dt >Type</dt>
    
                                  <dd >textarea</dd>
    
    
    
                                  <dt >Properties</dt>
    
                                  <dd >Create, Nillable, Update</dd>
    
    
    
                                  <dt >Description</dt>
    
                                  <dd > The source of the emissions factor reference data. </dd>
    
    
                            </dl>
    
                         </td>
    
                      </tr>
    
                      <tr>
                         <td  headers="d51659e96" data-title="Field"><span >EmissionFactorUpdateYear</span></td>
    
                         <td  headers="d51659e99" data-title="Details">
                            <dl >
    
                                  <dt >Type</dt>
    
                                  <dd >picklist</dd>
    
    
    
                                  <dt >Properties</dt>
    
                                  <dd >Create, Filter, Group, Nillable, Restricted picklist, Sort,
                                     Update</dd>
    
    
    
                                  <dt >Description</dt>
    
                                  <dd > The year in which this reference data for the emissions factor
                                     was most recently updated. </dd>
    
                                  <dd >Possible values are: <ul >
    
                                        <li ><samp >2000</samp></li>
    
                                        <li ><samp >2001</samp></li>
    
                                        <li ><samp >2002</samp></li>
    
                                        <li ><samp >2003</samp></li>
    
                                        <li ><samp >2004</samp></li>
    
                                        <li ><samp >2005</samp></li>
    
                                        <li ><samp >2006</samp></li>
    
                                        <li ><samp >2007</samp></li>
    
                                        <li ><samp >2008</samp></li>
    
                                        <li ><samp >2009</samp></li>
    
                                        <li ><samp >2010</samp></li>
    
                                        <li ><samp >2011</samp></li>
    
                                        <li ><samp >2012</samp></li>
    
                                        <li ><samp >2013</samp></li>
    
                                        <li ><samp >2014</samp></li>
    
                                        <li ><samp >2015</samp></li>
    
                                        <li ><samp >2016</samp></li>
    
                                        <li ><samp >2017</samp></li>
    
                                        <li ><samp >2018</samp></li>
    
                                        <li ><samp >2019</samp></li>
    
                                        <li ><samp >2020</samp></li>
    
                                        <li ><samp >2021</samp></li>
    
                                        <li ><samp >2022</samp></li>
    
                                        <li ><samp >2023</samp></li>
    
                                        <li ><samp >2024</samp></li>
    
                                        <li ><samp >2025</samp></li>
    
                                        <li ><samp >2026</samp></li>
    
                                        <li ><samp >2027</samp></li>
    
                                        <li ><samp >2028</samp></li>
    
                                        <li ><samp >2029</samp></li>
    
                                        <li ><samp >2030</samp></li>
    
                                        <li ><samp >2031</samp></li>
    
                                        <li ><samp >2032</samp></li>
    
                                        <li ><samp >2033</samp></li>
    
                                        <li ><samp >2034</samp></li>
    
                                        <li ><samp >2035</samp></li>
    
                                        <li ><samp >2036</samp></li>
    
                                        <li ><samp >2037</samp></li>
    
                                        <li ><samp >2038</samp></li>
    
                                        <li ><samp >2039</samp></li>
    
                                        <li ><samp >2040</samp></li>
    
                                     </ul>
    
                                  </dd>
    
    
                            </dl>
    
                         </td>
    
                      </tr>
    
                      <tr>
                         <td  headers="d51659e96" data-title="Field"><span >LastReferencedDate</span></td>
    
                         <td  headers="d51659e99" data-title="Details">
                            <dl >
    
                                  <dt >Type</dt>
    
                                  <dd >dateTime</dd>
    
    
    
                                  <dt >Properties</dt>
    
                                  <dd >Filter, Nillable, Sort</dd>
    
    
    
                                  <dt >Description</dt>
    
                                  <dd >The timestamp for when the current user last viewed a record
                                     related to this record.</dd>
    
    
                            </dl>
    
                         </td>
    
                      </tr>
    
                      <tr>
                         <td  headers="d51659e96" data-title="Field"><span >LastViewedDate</span></td>
    
                         <td  headers="d51659e99" data-title="Details">
                            <dl >
    
                                  <dt >Type</dt>
    
                                  <dd >dateTime</dd>
    
    
    
                                  <dt >Properties</dt>
    
                                  <dd >Filter, Nillable, Sort</dd>
    
    
    
                                  <dt >Description</dt>
    
                                  <dd >The timestamp for when the current user last viewed this record.
                                     If this value is null, this record might only have been referenced
                                     (LastReferencedDate) and not viewed. </dd>
    
    
                            </dl>
    
                         </td>
    
                      </tr>
    
                      <tr>
                         <td  headers="d51659e96" data-title="Field"><span >LongHaulMinimumDistance</span></td>
    
                         <td  headers="d51659e99" data-title="Details">
                            <dl >
    
                                  <dt >Type</dt>
    
                                  <dd >double</dd>
    
    
    
                                  <dt >Properties</dt>
    
                                  <dd >Create, Filter, Nillable, Sort, Update</dd>
    
    
    
                                  <dt >Description</dt>
    
                                  <dd > The minimum distance for a long-haul flight that’s adjusted
                                     according to the short-haul or medium-haul distances. </dd>
    
    
                            </dl>
    
                         </td>
    
                      </tr>
    
                      <tr>
                         <td  headers="d51659e96" data-title="Field"><span >MediumHaulMaximumDistance</span></td>
    
                         <td  headers="d51659e99" data-title="Details">
                            <dl >
    
                                  <dt >Type</dt>
    
                                  <dd >double</dd>
    
    
    
                                  <dt >Properties</dt>
    
                                  <dd >Create, Filter, Nillable, Sort, Update</dd>
    
    
    
                                  <dt >Description</dt>
    
                                  <dd > The maximum distance of a medium-haul flight. </dd>
    
    
                            </dl>
    
                         </td>
    
                      </tr>
    
                      <tr>
                         <td  headers="d51659e96" data-title="Field"><span >N2oPsgrKmLongHaulInKgCo2e</span></td>
    
                         <td  headers="d51659e99" data-title="Details">
                            <dl >
    
                                  <dt >Type</dt>
    
                                  <dd >double</dd>
    
    
    
                                  <dt >Properties</dt>
    
                                  <dd >Create, Filter, Nillable, Sort, Update</dd>
    
    
    
                                  <dt >Description</dt>
    
                                  <dd > The N2O emissions per passenger-kilometer in CO2e from long-haul
                                     flights. </dd>
    
    
                            </dl>
    
                         </td>
    
                      </tr>
    
                      <tr>
                         <td  headers="d51659e96" data-title="Field"><span >N2oPsgrKmMediumHaulInKgCo2e</span></td>
    
                         <td  headers="d51659e99" data-title="Details">
                            <dl >
    
                                  <dt >Type</dt>
    
                                  <dd >double</dd>
    
    
    
                                  <dt >Properties</dt>
    
                                  <dd >Create, Filter, Nillable, Sort, Update</dd>
    
    
    
                                  <dt >Description</dt>
    
                                  <dd > The N2O emissions per passenger-kilometer in CO2e from
                                     medium-haul flights.</dd>
    
    
                            </dl>
    
                         </td>
    
                      </tr>
    
                      <tr>
                         <td  headers="d51659e96" data-title="Field"><span >N2oPsgrKmShortHaulInKgCo2e</span></td>
    
                         <td  headers="d51659e99" data-title="Details">
                            <dl >
    
                                  <dt >Type</dt>
    
                                  <dd >double</dd>
    
    
    
                                  <dt >Properties</dt>
    
                                  <dd >Create, Filter, Nillable, Sort, Update</dd>
    
    
    
                                  <dt >Description</dt>
    
                                  <dd > The N2O emissions per passenger-kilometer in CO2e from short-haul
                                     flights.</dd>
    
    
                            </dl>
    
                         </td>
    
                      </tr>
    
                      <tr>
                         <td  headers="d51659e96" data-title="Field"><span >N2oPsgrMileLongHaulInKgCo2e</span></td>
    
                         <td  headers="d51659e99" data-title="Details">
                            <dl >
    
                                  <dt >Type</dt>
    
                                  <dd >double</dd>
    
    
    
                                  <dt >Properties</dt>
    
                                  <dd >Create, Filter, Nillable, Sort, Update</dd>
    
    
    
                                  <dt >Description</dt>
    
                                  <dd > The N2O emissions per passenger-mile in CO2e from long-haul
                                     flights. </dd>
    
    
                            </dl>
    
                         </td>
    
                      </tr>
    
                      <tr>
                         <td  headers="d51659e96" data-title="Field"><span >N2oPsgrMileMediumHaulInKgCo2e</span></td>
    
                         <td  headers="d51659e99" data-title="Details">
                            <dl >
    
                                  <dt >Type</dt>
    
                                  <dd >double</dd>
    
    
    
                                  <dt >Properties</dt>
    
                                  <dd >Create, Filter, Nillable, Sort, Update</dd>
    
    
    
                                  <dt >Description</dt>
    
                                  <dd > The N2O emissions per passenger-mile in CO2e from medium-haul
                                     flights.</dd>
    
    
                            </dl>
    
                         </td>
    
                      </tr>
    
                      <tr>
                         <td  headers="d51659e96" data-title="Field"><span >N2oPsgrMileShortHaulInKgCo2e</span></td>
    
                         <td  headers="d51659e99" data-title="Details">
                            <dl >
    
                                  <dt >Type</dt>
                                  <dd >double</dd>
                                  <dt >Properties</dt>
                                  <dd >Create, Filter, Nillable, Sort, Update</dd>
                                  <dt >Description</dt>
    
                                  <dd > The N2O emissions per passenger-mile in CO2e from short-haul
                                     flights.</dd>
                            </dl>
    
                         </td>
    
                      </tr>
    
                      <tr>
                         <td  headers="d51659e96" data-title="Field"><span >Name</span></td>
    
                         <td  headers="d51659e99" data-title="Details">
                            <dl >
    
                                  <dt >Type</dt>
                                  <dd >string</dd>
                                  <dt >Properties</dt>
                                  <dd >Create, Filter, Group, idLookup, Sort, Update</dd>
                                  <dt >Description</dt>
                                  <dd >Name of the account.</dd>
                            </dl>
                         </td>
                      </tr>
                      <tr>
                         <td  headers="d51659e96" data-title="Field"><span >OwnerId</span></td>
    
                         <td  headers="d51659e99" data-title="Details">
                            <dl >
                                  <dt >Type</dt>
                                  <dd >reference</dd>
                                  <dt >Properties</dt>
                                  <dd >Create, Defaulted on create, Filter, Group, Sort, Update</dd>
                                  <dt >Description</dt>
                                  <dd >The ID of the user who owns this record. </dd>
                                  <dd >This is a polymorphic relationship field.</dd>
                                  <dt >Relationship Name</dt>
                                  <dd >Owner</dd>
                                  <dt >Relationship Type</dt>
                                  <dd >Lookup</dd>
                                  <dt >Refers To</dt>
                                  <dd >Group, User</dd>
                            </dl>
                         </td>
                      </tr>
                      <tr>
                         <td  headers="d51659e96" data-title="Field"><span >ShortHaulMaximumDistance</span></td>
                         <td  headers="d51659e99" data-title="Details">
                            <dl >
                                  <dt >Type</dt>
                                  <dd >double</dd>
                                  <dt >Properties</dt>
                                  <dd >Create, Filter, Nillable, Sort, Update</dd>
                                  <dt >Description</dt>
                                  <dd > The maximum distance of a short-haul flight. </dd>
                            </dl>
                         </td>
                      </tr>
                   </tbody>
                    </table>
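
Once you have the outerHTML string, the shadow DOM no longer matters and you can parse it like any other HTML. BeautifulSoup would work on it too; the sketch below uses the standard library's `html.parser` instead so it runs without extra installs. `FieldTableParser` and `parse_fields` are names I made up, and the parser assumes the row layout shown in the output above: a `td` with `data-title="Field"`, followed by `dt`/`dd` pairs.

```python
from html.parser import HTMLParser


class FieldTableParser(HTMLParser):
    """Collect one dict per body row: the field name plus its <dt>/<dd> details."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None      # dict for the <tr> currently being read
        self._capture = None  # "field", "dt" or "dd" while collecting text
        self._text = []
        self._key = None      # label of the most recent <dt>

    def _start(self, kind):
        self._capture = kind
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "tr":
            self._row = {}
        elif self._row is None or self._capture:
            return  # outside a row, or inside a cell we are already reading
        elif tag == "td" and attrs.get("data-title") == "Field":
            self._start("field")
        elif tag in ("dt", "dd"):
            self._start(tag)

    def handle_data(self, data):
        if self._capture:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._row is None:
            return
        text = " ".join("".join(self._text).split())  # collapse whitespace
        if tag == "td" and self._capture == "field":
            self._row["Field"] = text
            self._capture = None
        elif tag == "dt" and self._capture == "dt":
            self._key = text
            self._capture = None
        elif tag == "dd" and self._capture == "dd":
            # A <dt> can be followed by several <dd> entries, so keep a list.
            self._row.setdefault(self._key, []).append(text)
            self._capture = None
        elif tag == "tr":
            if "Field" in self._row:  # skips the header row
                self.rows.append(self._row)
            self._row = None


def parse_fields(outer_html):
    """Return a list of dicts such as {"Field": ..., "Type": [...], ...}."""
    parser = FieldTableParser()
    parser.feed(outer_html)
    parser.close()
    return parser.rows
```

You would call it as `parse_fields(element.get_attribute("outerHTML"))` before `driver.quit()`. Each detail is stored as a list because some fields have more than one `dd` under the same `dt`.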
    