I'm trying to scrape a table from a website with different table option values

Time:08-06

I'm trying to scrape a table from several web pages, where the table shown depends on the selected option value.

The code I currently have is:

test = {}
dict_scr = {}
for ii in range(0, 12):
    options = webdriver.FirefoxOptions()
    options.binary_location = r'C://Local/Mozilla Firefox/firefox.exe'
    driver = webdriver.Firefox(executable_path='C:/geckodriver.exe', options=options)
    driver.execute("get", {'url': link_scr['Links'][ii]})
    # Wait for the holdings table, then keep its HTML and the parsed DataFrames
    test[link_scr.index[ii]] = WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.CSS_SELECTOR, "table#current_holdings_table"))).get_attribute("outerHTML")
    dict_scr[link_scr.index[ii]] = pd.read_html(test[link_scr.index[ii]])
    print(test[link_scr.index[ii]])
    # Close this browser before the next iteration opens a new one
    driver.quit()

How can I update my code to select the required option value?

Can you help me with this challenge?

Thanks in advance

CodePudding user response:

To select the <option> with the text Q1 2022 13F Filings, you need to induce WebDriverWait for element_to_be_clickable(), and you can use either of the following locator strategies:

  • Using CSS_SELECTOR:

    driver.execute("get", {'url': 'https://whalewisdom.com/filer/berkshire-hathaway-inc#tabholdings_tab_link'})
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a.dfwid-close"))).click()
    driver.execute_script("return arguments[0].scrollIntoView(true);", WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "label#qtr-1-label"))))
    Select(WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "select#quarter_one")))).select_by_value('85')
    
  • Using XPATH:

    driver.execute("get", {'url': 'https://whalewisdom.com/filer/berkshire-hathaway-inc#tabholdings_tab_link'})
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//a[@class='dfwid-close']"))).click()
    driver.execute_script("return arguments[0].scrollIntoView(true);", WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//label[@id='qtr-1-label']"))))
    Select(WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//select[@id='quarter_one']")))).select_by_value('85')
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import Select
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    
  • Browser snapshot (image not preserved): Quarter to view
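
The value '85' used above is specific to this page, so for other quarters you need to know which value belongs to which label. The following is only a rough sketch under assumptions, not part of the original answer: option_values() and scrape_quarter() are hypothetical helper names, and the '84' entry in the sample HTML is made up for illustration. The first helper reads the label-to-value mapping from the outerHTML of the <select> element; the second repeats the steps above with select_by_visible_text() and then reads the table as in the question. pd.read_html() additionally needs an HTML parser such as lxml installed, and the table may reload asynchronously after the selection, which this sketch does not handle.

```python
from html.parser import HTMLParser
from io import StringIO


class OptionCollector(HTMLParser):
    """Collect {visible text: value} for each <option> that has a value attribute."""

    def __init__(self):
        super().__init__()
        self.options = {}
        self._value = None  # value of the <option> currently being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "option":
            self._value = dict(attrs).get("value")
            self._text = []

    def handle_data(self, data):
        if self._value is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "option" and self._value is not None:
            # Collapse runs of whitespace so labels compare cleanly
            label = " ".join("".join(self._text).split())
            self.options[label] = self._value
            self._value = None


def option_values(select_html):
    """Return the label-to-value mapping found in a <select> HTML snippet."""
    parser = OptionCollector()
    parser.feed(select_html)
    parser.close()
    return parser.options


def scrape_quarter(driver, url, quarter_text, timeout=20):
    """Pick `quarter_text` in the dropdown and return the tables as DataFrames."""
    # Imported inside the function so option_values() can be used
    # without Selenium or pandas installed.
    import pandas as pd
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import Select, WebDriverWait

    wait = WebDriverWait(driver, timeout)
    driver.get(url)
    # Same steps as in the answer: close the pop-up, scroll to the dropdown
    wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a.dfwid-close"))).click()
    label = wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "label#qtr-1-label")))
    driver.execute_script("arguments[0].scrollIntoView(true);", label)
    dropdown = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "select#quarter_one")))
    # Select by the visible label instead of the page-specific value
    Select(dropdown).select_by_visible_text(quarter_text)
    table = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "table#current_holdings_table")))
    return pd.read_html(StringIO(table.get_attribute("outerHTML")))


# Sample snippet; the "84" entry is an invented example
sample = (
    '<select id="quarter_one">'
    '<option value="84">Q4 2021 13F Filings</option>'
    '<option value="85">Q1 2022 13F Filings</option>'
    '</select>'
)
print(option_values(sample)["Q1 2022 13F Filings"])  # 85
```

With a running driver, a call such as scrape_quarter(driver, link_scr['Links'][ii], 'Q1 2022 13F Filings') could replace the table-reading lines inside the loop from the question, provided every page uses the same element ids.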
