I'm trying to scrape the website ted.europa.eu using Python with Selenium to retrieve information about tenders. The script is supposed to run once a day and pick up the new publications. To reach the new tenders, Selenium has to apply a filter so that only the ones published on the day the script is executed are returned. I already have a script for this and it works perfectly, but when I activate headless mode I get the following error:
selenium.common.exceptions.ElementNotInteractableException: Message: element not interactable: [object HTMLInputElement] has no size and location
This is the code I have that applies the filter I need:
import sys
import time
import re
from datetime import datetime
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.action_chains import ActionChains
from dotenv import load_dotenv
load_dotenv("../../../../.env")
sys.path.append("../src")
sys.path.append("../../../../utils")
from driver import *
from lted import LTED
from runnable import *
# start
print('start...')
counter = 0
start = datetime.now()
# get driver
driver = get_driver_from_url("https://ted.europa.eu/TED/browse/browseByMap.do")
actions = ActionChains(driver)
# change language to spanish
WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "lgId")))
driver.find_element(By.ID, "lgId").click()
driver.find_element(By.XPATH, "//select[@id='lgId']/option[text()='español (es)']").click()
# click on "Busqueda avanzada"
WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "goToSearch")))
driver.find_element(By.ID, "goToSearch").click()
# accept cookies and close tab
WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "cookie-consent-banner")))
driver.find_element(By.XPATH, "//div[@id='cookie-consent-banner']/div[1]/div[1]/div[2]/a[1]").click()
driver.find_element(By.XPATH, "//div[@id='cookie-consent-banner']/div[1]/div[1]/div[2]/a[1]").click()
# click on specific date and set to today
WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "publicationDateSpecific")))
element = driver.find_element(By.ID, "publicationDateSpecific")
actions.move_to_element(element).perform()
driver.find_element(By.ID, "publicationDateSpecific").click()
driver.find_element(By.CLASS_NAME, "ui-state-highlight").click()
# click on search
driver.find_element(By.ID, "search").click()
From the imports, the only thing I need to explain is that the line from driver import *
provides the method get_driver_from_url()
, which is used later in the code. This method looks like this:
def get_driver_from_url(url):
    options = webdriver.ChromeOptions()
    options.add_argument("--no-sandbox")
    options.add_argument("--disable-dev-shm-usage")
    options.add_argument("--start-maximized")
    options.add_argument("--headless")
    driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
    driver.get(url)
    return driver
As I said, this code works perfectly without headless mode, but with headless mode activated I get the error.
At first I got a different error, and searching the Internet I found that it could be because the element is not on screen. So I added the argument "--start-maximized"
to make the Chrome window as large as possible, and added ActionChains so that I could call actions.move_to_element(element).perform()
, but now I get this error on exactly that line.
I also tried changing the line WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "publicationDateSpecific")))
to WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, "publicationDateSpecific")))
but that didn't work either.
Update: I also tried changing to EC.visibility_of_element_located
as mentioned in this post, but that didn't work either.
What am I doing wrong?
CodePudding user response:
This is probably because of the window size. Try adding this:
chrome_options = Options()
chrome_options.add_argument("--window-size=1920,1080")
chrome_options.add_argument("--start-maximized")
chrome_options.add_argument("--headless")
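As a rough sketch of how this could be folded into the asker's get_driver_from_url(): headless Chrome has no real window to maximize, so "--start-maximized" is unlikely to help there, and an explicit "--window-size" is the usual way to control the viewport. The helper names build_chrome_arguments and make_driver below are hypothetical, and the sketch assumes a Selenium 4 installation that can locate a Chrome driver on its own.

```python
def build_chrome_arguments(headless=True, width=1920, height=1080):
    """Return the Chrome command-line flags for the scraper (hypothetical helper)."""
    arguments = [
        "--no-sandbox",
        "--disable-dev-shm-usage",
        # An explicit size, because headless Chrome has no window to maximize.
        "--window-size={},{}".format(width, height),
    ]
    if headless:
        arguments.append("--headless")
    return arguments


def make_driver(url, headless=True):
    """Create a Chrome driver with the flags above and open url (hypothetical helper)."""
    # Imported here so that the flag builder can be used without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for argument in build_chrome_arguments(headless=headless):
        options.add_argument(argument)
    # Assumption: Selenium can find a matching driver binary by itself.
    driver = webdriver.Chrome(options=options)
    driver.get(url)
    return driver
```

Calling make_driver("https://ted.europa.eu/TED/browse/browseByMap.do") would then replace the original get_driver_from_url() call. Whether a larger viewport is enough for this particular page is not guaranteed.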
CodePudding user response:
So, after a long time of trial and error, I found that adding
element = driver.find_element(By.ID, "publicationDateSpecific")
driver.execute_script("window.scrollTo(0," + str(element.location["y"]) + ")")
makes the script work in both headless mode and normal mode.
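The scroll-then-click approach above can be wrapped in a small helper so it is easy to reuse for other elements on the page. This is only a minimal sketch: build_scroll_script and scroll_to_and_click are hypothetical names, and the driver argument is assumed to be a Selenium WebDriver (anything with find_element and execute_script would work).

```python
def build_scroll_script(y):
    """Return the JavaScript that scrolls the page vertically to offset y."""
    return "window.scrollTo(0, " + str(int(y)) + ")"


def scroll_to_and_click(driver, by, value):
    """Locate an element, scroll the page to its vertical position, then click it."""
    element = driver.find_element(by, value)
    # element.location is a dict with the element's "x" and "y" page coordinates.
    driver.execute_script(build_scroll_script(element.location["y"]))
    element.click()
    return element
```

In the original script this would be called as scroll_to_and_click(driver, By.ID, "publicationDateSpecific") in place of the move_to_element() and click() lines.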