I am trying to create a scraper using Selenium on AWS, so I used the following code on my desktop:
import boto3
from selenium.webdriver import Remote
from selenium.webdriver import DesiredCapabilities

# Create a Device Farm client and request a short-lived TestGrid URL for the Remote driver
devicefarm_client = boto3.client('devicefarm', region_name='us-west-2', verify=False)
testgrid_url_response = devicefarm_client.create_test_grid_url(projectArn="<insert_arn_here>", expiresInSeconds=120)

desired_capabilities = DesiredCapabilities.CHROME
desired_capabilities['platform'] = 'windows'
desired_capabilities['acceptInsecureCerts'] = True

# fails here (in Lambda) with the SSL error shown below
driver = Remote(testgrid_url_response['url'], desired_capabilities)

# unreachable code
driver.get('https://www.google.com')
driver.quit()
When I run this code on my desktop, I can successfully invoke Device Farm, but when I run identical code in an AWS Lambda function, I get the following error:
"errorMessage": "<urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: IP address mismatch, certificate is not valid for '54.185.155.213'. (_ssl.c:1129)>",
"errorType": "URLError",
"requestId": "8c064e3e-f362-441b-be0c-3bc00ad109ba",
"stackTrace": [
" File \"/var/task/lambda_function.py\", line 24, in lambda_handler\n driver = Remote(testgrid_url_response['url'], desired_capabilities)\n",
" File \"/opt/python/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py\", line 92, in __init__\n self.start_session(desired_capabilities, browser_profile)\n",
" File \"/opt/python/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py\", line 179, in start_session\n response = self.execute(Command.NEW_SESSION, capabilities)\n",
" File \"/opt/python/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py\", line 234, in execute\n response = self.command_executor.execute(driver_command, params)\n",
" File \"/opt/python/lib/python3.9/site-packages/selenium/webdriver/remote/remote_connection.py\", line 408, in execute\n return self._request(command_info[0], url, body=data)\n",
" File \"/opt/python/lib/python3.9/site-packages/selenium/webdriver/remote/remote_connection.py\", line 478, in _request\n resp = opener.open(request, timeout=self._timeout)\n",
" File \"/var/lang/lib/python3.9/urllib/request.py\", line 517, in open\n response = self._open(req, data)\n",
" File \"/var/lang/lib/python3.9/urllib/request.py\", line 534, in _open\n result = self._call_chain(self.handle_open, protocol, protocol \n",
" File \"/var/lang/lib/python3.9/urllib/request.py\", line 494, in _call_chain\n result = func(*args)\n",
" File \"/var/lang/lib/python3.9/urllib/request.py\", line 1389, in https_open\n return self.do_open(http.client.HTTPSConnection, req,\n",
" File \"/var/lang/lib/python3.9/urllib/request.py\", line 1349, in do_open\n raise URLError(err)\n"
]
}
This is in spite of setting verify=False on the boto3 client, placing the Lambda in the us-west-2 region, setting desired_capabilities['acceptInsecureCerts'] = True, and attaching IAM role policies that grant full access to Device Farm. I have tried the Python 3.8 and Python 3.9 runtimes, and I have also tried passing my aws_access_key_id and aws_secret_access_key in directly, with the same results.
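For reference, the Lambda entry point wraps the same steps in a handler roughly like the sketch below. The handler body is reconstructed from the snippet above and the lambda_handler reference in the stack trace; the return value and the try/finally cleanup are illustrative, not the exact deployed code.

import boto3
from selenium.webdriver import Remote
from selenium.webdriver import DesiredCapabilities

def lambda_handler(event, context):
    devicefarm_client = boto3.client('devicefarm', region_name='us-west-2', verify=False)
    testgrid_url_response = devicefarm_client.create_test_grid_url(
        projectArn="<insert_arn_here>", expiresInSeconds=120)
    desired_capabilities = DesiredCapabilities.CHROME
    desired_capabilities['platform'] = 'windows'
    desired_capabilities['acceptInsecureCerts'] = True
    # Session creation is where the SSL error is raised when running in Lambda
    driver = Remote(testgrid_url_response['url'], desired_capabilities)
    try:
        driver.get('https://www.google.com')
        return {'title': driver.title}
    finally:
        driver.quit()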
CodePudding user response:
The answer, thanks to @Tobe E's suggestion, turned out to be a Selenium version mismatch: my requirements.txt contained selenium==3.0.2, whereas my local installation had selenium==3.141.0. After upgrading the Selenium package in the deployment to match the local version and redeploying to AWS, the code worked as expected without SSL certificate errors.
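If it helps anyone hitting the same error: a quick way to confirm which Selenium version actually got deployed is to log it from inside the function before creating the Remote driver. A minimal check, assuming the selenium package is importable in the Lambda environment:

import selenium
# Log the deployed Selenium version to CloudWatch so it can be compared against
# the local environment (e.g. selenium==3.141.0 pinned in requirements.txt).
print('selenium version:', selenium.__version__)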