I can't figure out where the error is in this code. Basically it enters a code in the search bar, clicks a button, and extracts the results:
from seleniumwire import webdriver
import time

API_KEY = 'my_api_key'
proxy_options = {
    'proxy': {
        'https': f'http://scraperapi:{API_KEY}@proxy-server.scraperapi.com:8001',
        'no_proxy': 'localhost,127.0.0.1'
    }
}

url = 'https://www.ufficiocamerale.it/'
vats = ['06655971007', '05779661007', '08526440154']

for vat in vats:
    driver = webdriver.Chrome(seleniumwire_options=proxy_options)
    driver.get(url)
    time.sleep(5)
    item = driver.find_element_by_xpath('//form[@id="formRicercaAzienda"]//input[@id="search_input"]')
    item.send_keys(vat)
    time.sleep(1)
    button = driver.find_element_by_xpath('//form[@id="formRicercaAzienda"]//p//button[@type="submit"]')
    button.click()
    time.sleep(5)
    all_items = driver.find_elements_by_xpath('//ul[@id="first-group"]/li')
    for item in all_items:
        if '@' in item.text:
            print(item.text.split(' ')[1])
    driver.close()
Running the script (chromedriver.exe is saved in the same folder, and I'm working in a Jupyter Notebook, in case it matters), I get
NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//form[@id="formRicercaAzienda"]//input[@id="search_input"]"}
but this element does exist, because running the script without ScraperAPI raises no errors. Can anyone figure out what the problem is?
CodePudding user response:
- Here you are running a loop over 3 vat values. After the first click on the search button, the result page is presented, and there is no search input field or search button there! So, in order to perform a new search, you need to go back to the previous page after collecting the data from the result page.
- There is no need to create a new instance of the web driver on each iteration.
- Also, you should use Expected Conditions explicit waits instead of hardcoded pauses; a short sketch of the pattern follows this list.
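As a minimal illustration of that last point (assuming an existing driver object; the 10-second timeout is an arbitrary choice), an explicit wait replaces a hardcoded pause like this:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Instead of time.sleep(5) followed by find_element, poll until the element
# is actually visible, or raise TimeoutException after 10 seconds:
wait = WebDriverWait(driver, 10)
item = wait.until(EC.visibility_of_element_located(
    (By.XPATH, '//form[@id="formRicercaAzienda"]//input[@id="search_input"]')))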
This should work better:
from seleniumwire import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time

API_KEY = 'my_api_key'
proxy_options = {
    'proxy': {
        'https': f'http://scraperapi:{API_KEY}@proxy-server.scraperapi.com:8001',
        'no_proxy': 'localhost,127.0.0.1'
    }
}

url = 'https://www.ufficiocamerale.it/'
vats = ['06655971007', '05779661007', '08526440154']

# Create the driver and the wait object once, outside the loop
driver = webdriver.Chrome(seleniumwire_options=proxy_options)
wait = WebDriverWait(driver, 20)
driver.get(url)

for vat in vats:
    input_search = wait.until(EC.visibility_of_element_located((By.XPATH, '//form[@id="formRicercaAzienda"]//input[@id="search_input"]')))
    input_search.clear()
    input_search.send_keys(vat)
    time.sleep(0.5)
    wait.until(EC.visibility_of_element_located((By.XPATH, '//form[@id="formRicercaAzienda"]//p//button[@type="submit"]'))).click()
    wait.until(EC.visibility_of_element_located((By.XPATH, '//ul[@id="first-group"]/li')))
    time.sleep(0.5)
    all_items = driver.find_elements(By.XPATH, '//ul[@id="first-group"]/li')
    for item in all_items:
        if '@' in item.text:
            print(item.text.split(' ')[1])
    # Go back to the search page before starting the next iteration
    driver.execute_script("window.history.go(-1)")

driver.close()
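A side note on the back-navigation step: driver.back() is Selenium's built-in way to go one step back in the browser history, so the execute_script call above could equally be written as:

# Built-in alternative to executing window.history.go(-1) via JavaScript:
driver.back()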
UPD
This code is working, the output is