I'm trying to build a web scraper in Python. It works fine on my Windows 10 machine, but when I deploy it to Heroku I get the following error, and I don't know why.
Code :
from selenium import webdriver
import os
URL = "https://www.google.com" #URL
options = webdriver.ChromeOptions()
options.binary_location = os.environ.get("GOOGLE_CHROME_BIN")
options.add_argument("--no-sandbox")
options.add_argument("--disable-gpu")
options.add_argument("--headless")
driver = webdriver.Chrome(executable_path=os.environ.get("CHROMEDRIVER_PATH"), options=options)
driver.get(URL)
Error :
2022-06-05T13:47:06.222744+00:00 app[worker.1]: /app/main.py:12: DeprecationWarning: executable_path has been deprecated, please pass in a Service object
2022-06-05T13:47:06.222759+00:00 app[worker.1]: driver = webdriver.Chrome(executable_path=os.environ['CHROMEDRIVER_PATH'], options=options)
2022-06-05T13:47:06.815862+00:00 app[worker.1]: Traceback (most recent call last):
2022-06-05T13:47:06.815885+00:00 app[worker.1]: File "/app/main.py", line 12, in <module>
2022-06-05T13:47:06.816238+00:00 app[worker.1]: driver = webdriver.Chrome(executable_path=os.environ['CHROMEDRIVER_PATH'], options=options)
2022-06-05T13:47:06.816242+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/selenium/webdriver/chrome/webdriver.py", line 70, in __init__
2022-06-05T13:47:06.816407+00:00 app[worker.1]: super(WebDriver, self).__init__(DesiredCapabilities.CHROME['browserName'], "goog",
2022-06-05T13:47:06.816422+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/selenium/webdriver/chromium/webdriver.py", line 92, in __init__
2022-06-05T13:47:06.816570+00:00 app[worker.1]: RemoteWebDriver.__init__(
2022-06-05T13:47:06.816594+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 275, in __init__
2022-06-05T13:47:06.816794+00:00 app[worker.1]: self.start_session(capabilities, browser_profile)
2022-06-05T13:47:06.816809+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 365, in start_session
2022-06-05T13:47:06.817076+00:00 app[worker.1]: response = self.execute(Command.NEW_SESSION, parameters)
2022-06-05T13:47:06.817091+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 430, in execute
2022-06-05T13:47:06.817333+00:00 app[worker.1]: self.error_handler.check_response(response)
2022-06-05T13:47:06.817348+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/selenium/webdriver/remote/errorhandler.py", line 247, in check_response
2022-06-05T13:47:06.817529+00:00 app[worker.1]: raise exception_class(message, screen, stacktrace)
2022-06-05T13:47:06.817616+00:00 app[worker.1]: selenium.common.exceptions.WebDriverException: Message: unknown error: Chrome failed to start: crashed.
2022-06-05T13:47:06.817616+00:00 app[worker.1]: (unknown error: DevToolsActivePort file doesn't exist)
2022-06-05T13:47:06.817617+00:00 app[worker.1]: (The process started from chrome location /app/.apt/opt/google/chrome/chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
I hope someone is able to help.
Version :
- Python v3.9.13
- Selenium v4.2.0
- Heroku 20
Answer :
Solution
With Selenium 4, the executable_path argument is deprecated. Instead, pass an instance of the Service class, for example together with ChromeDriverManager().install(), as shown below.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
options = Options()
options.add_argument("start-maximized")
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
driver.get("https://www.google.com")
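Note that the DeprecationWarning in the log is only a warning; the exception at the end of the traceback is "Chrome failed to start: crashed". A minimal sketch for the Heroku case follows, assuming the GOOGLE_CHROME_BIN and CHROMEDRIVER_PATH variables from the question are set on the dyno. It keeps the headless flags from the question and passes the driver path through Service. The --disable-dev-shm-usage flag is an addition that is often suggested for the DevToolsActivePort error in containers; whether it resolves this particular crash is an assumption. The names HEADLESS_ARGS and build_driver are hypothetical.

```python
import os

# Flags taken from the question, plus --disable-dev-shm-usage.
# Assumption: the extra flag helps in containers with a small /dev/shm;
# it is not confirmed to fix the crash shown in the log.
HEADLESS_ARGS = [
    "--headless",
    "--no-sandbox",
    "--disable-gpu",
    "--disable-dev-shm-usage",
]


def build_driver():
    """Create a headless Chrome driver using the Selenium 4 Service API."""
    # Imported here so this module can be loaded without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    options = webdriver.ChromeOptions()

    # Only override the browser binary when the variable is actually set.
    chrome_bin = os.environ.get("GOOGLE_CHROME_BIN")
    if chrome_bin:
        options.binary_location = chrome_bin

    for arg in HEADLESS_ARGS:
        options.add_argument(arg)

    # Pass the driver path via Service instead of the deprecated
    # executable_path keyword; with no path, Service() falls back to
    # Selenium's default driver lookup.
    driver_path = os.environ.get("CHROMEDRIVER_PATH")
    service = Service(executable_path=driver_path) if driver_path else Service()

    return webdriver.Chrome(service=service, options=options)
```

Usage would then be driver = build_driver(), followed by driver.get("https://www.google.com") and driver.quit() when done. If Chrome still crashes, it may be worth checking that both environment variables point at files that exist on the dyno and that the Chrome and chromedriver versions match.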