Getting a connection error while web scraping in AWS Lambda

Time:08-18

I am using EFS to store Python packages for my Lambda function, and I have been running this simple code to check the connection to the site:

import json
import sys

# Third-party packages are installed on the EFS mount, so add it to the import path first
sys.path.append("/mnt/access")

import requests
from bs4 import BeautifulSoup


def lambda_handler(event, context):
    url = "http://www.wordhippo.com/what-is/another-word-for/credit"
    print(url)

    # This is the call that fails with ConnectionError
    page = requests.get(url)
    # soup = BeautifulSoup(page.content, 'html.parser')
    print(page)

These are the CloudWatch logs:

[ERROR] ConnectionError: HTTPSConnectionPool(host='www.wordhippo.com', port=443): Max retries exceeded with url: /what-is/another-word-for/credit.html (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7eff5fa618e0>: Failed to establish a new connection: [Errno 110] Connection timed out'))
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 126, in lambda_handler
    page = requests.get(url)
  File "/mnt/access/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
  File "/mnt/access/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/mnt/access/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "/mnt/access/requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "/mnt/access/requests/adapters.py", line 565, in send
    raise ConnectionError(e, request=request)
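The `[Errno 110] Connection timed out` part of the log suggests the TCP connection itself never opened, which points at networking rather than at `requests` or the site's HTML. One way to confirm this quickly is a plain socket probe with a short timeout. The following is only a minimal sketch: the helper name `can_connect` and the 3-second timeout are assumptions for illustration, not part of the original code.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS failures alike
        return False


def lambda_handler(event, context):
    # Probe the same host and port that appear in the CloudWatch error
    reachable = can_connect("www.wordhippo.com", 443)
    print(f"www.wordhippo.com:443 reachable: {reachable}")
    return {"reachable": reachable}
```

If the probe reports `False` from inside Lambda while the same check succeeds from a local machine, the function most likely has no outbound route to the internet.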

CodePudding user response:

A Lambda function in a VPC does not have internet access if it is placed in public subnets, because its network interfaces are not assigned public IP addresses. The default VPC has only public subnets.

The easiest way to give your function internet access is simply not to place it in a VPC.
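To check whether a function is currently attached to a VPC, its configuration can be inspected with boto3. This is a rough sketch: the helper names and the function name `my-scraper-function` are hypothetical, and it assumes the caller has permission to call `get_function_configuration`.

```python
def is_attached_to_vpc(function_config: dict) -> bool:
    """Return True if the Lambda configuration lists at least one VPC subnet."""
    vpc = function_config.get("VpcConfig") or {}
    return bool(vpc.get("SubnetIds"))


def describe_networking(function_name: str) -> str:
    # Imported lazily so the helper above can be used without boto3 installed
    import boto3

    config = boto3.client("lambda").get_function_configuration(
        FunctionName=function_name
    )
    if is_attached_to_vpc(config):
        subnets = ", ".join(config["VpcConfig"]["SubnetIds"])
        return f"{function_name} runs in VPC subnets: {subnets}"
    return f"{function_name} is not attached to a VPC"


if __name__ == "__main__":
    # "my-scraper-function" is a placeholder name, not taken from the question
    print(describe_networking("my-scraper-function"))
```

Keep in mind that mounting an EFS file system requires the function to be attached to a VPC, so for the setup in the question this check will probably list subnets. In that case the usual alternative is to run the function in private subnets whose route table sends outbound traffic through a NAT gateway.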
