I'm trying to scrape a table from several websites (an example is at this URL:
The code I have implemented so far is:
import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# link_scr is defined elsewhere (it has a 'Links' column and an index)
test = {}
dict_scr = {}
for ii in range(0, 12):
    options = webdriver.FirefoxOptions()
    options.binary_location = r'C://Local/Mozilla Firefox/firefox.exe'
    driver = webdriver.Firefox(executable_path='C:/geckodriver.exe', options=options)
    driver.execute("get", {'url': link_scr['Links'][ii]})
    test[link_scr.index[ii]] = WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.CSS_SELECTOR, "table#current_holdings_table"))).get_attribute("outerHTML")
    dict_scr[link_scr.index[ii]] = pd.read_html(test[link_scr.index[ii]])
    print(test[link_scr.index[ii]])
How can I update my code to select the required option value?
Can you help me with this challenge?
Thanks in advance
CodePudding user response:
To select the <option> with the text Q1 2022 13F Filings, you need to induce WebDriverWait for element_to_be_clickable(). You can use either of the following locator strategies:
Using CSS_SELECTOR:
driver.execute("get", {'url': 'https://whalewisdom.com/filer/berkshire-hathaway-inc#tabholdings_tab_link'})
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a.dfwid-close"))).click()
driver.execute_script("return arguments[0].scrollIntoView(true);", WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "label#qtr-1-label"))))
Select(WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "select#quarter_one")))).select_by_value('85')
Using XPATH:
driver.execute("get", {'url': 'https://whalewisdom.com/filer/berkshire-hathaway-inc#tabholdings_tab_link'})
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//a[@class='dfwid-close']"))).click()
driver.execute_script("return arguments[0].scrollIntoView(true);", WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//label[@id='qtr-1-label']"))))
Select(WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//select[@id='quarter_one']")))).select_by_value('85')
Note: You have to add the following imports:
from selenium.webdriver.support.ui import Select
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
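The value '85' used above is specific to that page's dropdown. If you only know the label, Selenium's Select also provides select_by_visible_text(). To check which value belongs to which label, one option is to grab the outerHTML of select#quarter_one (the same way the question grabs the table) and parse it. Below is a minimal sketch using only the standard library. OptionCollector and option_values are illustrative names, and the sample markup is made up, apart from the pairing of 85 with Q1 2022 13F Filings:

```python
from html.parser import HTMLParser


class OptionCollector(HTMLParser):
    """Collect {visible text: value} for the <option> tags in a <select>."""

    def __init__(self):
        super().__init__()
        self.options = {}
        self._value = None  # value attribute of the <option> being read
        self._text = []     # text chunks seen inside that <option>

    def handle_starttag(self, tag, attrs):
        if tag == "option":
            self._flush()  # </option> end tags are optional in HTML
            # Options without a value attribute are skipped in this sketch
            self._value = dict(attrs).get("value")
            self._text = []

    def handle_data(self, data):
        if self._value is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in ("option", "select"):
            self._flush()

    def _flush(self):
        # Store the option collected so far, if any
        if self._value is not None:
            self.options["".join(self._text).strip()] = self._value
            self._value = None


def option_values(select_html):
    """Return a dict mapping each option's label to its value."""
    parser = OptionCollector()
    parser.feed(select_html)
    parser.close()
    return parser.options


# Hypothetical markup standing in for the outerHTML of select#quarter_one
sample = """
<select id="quarter_one">
  <option value="84">Q4 2021 13F Filings</option>
  <option value="85">Q1 2022 13F Filings</option>
</select>
"""
values = option_values(sample)
print(values["Q1 2022 13F Filings"])
```

The value looked up this way can then be passed to select_by_value() instead of being hard-coded.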