I'm new to AWS and shapefiles, so bear with me.
I'm currently trying to load a shapefile from Amazon S3 into an RDS PostgreSQL database. I'm struggling to read the file because it throws a MemoryError. What I've tried so far:
from io import StringIO, BytesIO
from zipfile import ZipFile
from urllib.request import urlopen
import shapefile
import geopandas as gpd
from shapely.geometry import shape
import pandas as pd
import requests
import boto3

def lambda_handler(event, context):
    """Accessing the S3 buckets using boto3 client"""
    s3_client = boto3.client('s3')
    s3_bucket_name = 'myshapeawsbucket'
    s3 = boto3.resource('s3', aws_access_key_id="my_id", aws_secret_access_key="my_key")
    my_bucket = s3.Bucket(s3_bucket_name)

    bucket_list = []
    for file in my_bucket.objects.filter():
        print(file.key)
        bucket_list.append(file.key)

    for file in bucket_list:
        obj = s3.Object(s3_bucket_name, file)
        data = obj.get()['Body'].read()

    return {
        'message': "Success!"
    }
As soon as the code tries to execute obj.get()['Body'].read(), I get the following error:
Response
{
  "errorMessage": "",
  "errorType": "MemoryError",
  "stackTrace": [
    "  File \"/var/task/lambda_function.py\", line 27, in lambda_handler\n    data=obj.get()['Body'].read()\n",
    "  File \"/var/runtime/botocore/response.py\", line 82, in read\n    chunk = self._raw_stream.read(amt)\n",
    "  File \"/opt/python/urllib3/response.py\", line 518, in read\n    data = self._fp.read() if not fp_closed else b\"\"\n",
    "  File \"/var/lang/lib/python3.8/http/client.py\", line 472, in read\n    s = self._safe_read(self.length)\n",
    "  File \"/var/lang/lib/python3.8/http/client.py\", line 613, in _safe_read\n    data = self.fp.read(amt)\n"
  ]
}
Function Logs
START RequestId: 7b70c331-ad7a-4964-b841-91da345b5174 Version: $LATEST
Roads_shp.zip
[ERROR] MemoryError
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 27, in lambda_handler
    data=obj.get()['Body'].read()
  File "/var/runtime/botocore/response.py", line 82, in read
    chunk = self._raw_stream.read(amt)
  File "/opt/python/urllib3/response.py", line 518, in read
    data = self._fp.read() if not fp_closed else b""
  File "/var/lang/lib/python3.8/http/client.py", line 472, in read
    s = self._safe_read(self.length)
  File "/var/lang/lib/python3.8/http/client.py", line 613, in _safe_read
    data = self.fp.read(amt)
END RequestId: 7b70c331-ad7a-4964-b841-91da345b5174
REPORT RequestId: 7b70c331-ad7a-4964-b841-91da345b5174 Duration: 3980.11 ms Billed Duration: 3981 ms Memory Size: 128 MB Max Memory Used: 128 MB Init Duration: 2334.01 ms
I'm following the tutorial here: Reading Shapefiles from a URL into GeoPandas.
I have looked into similar issues but couldn't find an answer specific to shapefiles.
Links I looked at:
- MemoryError when Using the read() Method in Reading a Large Size of JSON file from Amazon S3
- Importing Large Size of Zipped JSON File from Amazon S3 into AWS RDS-PostgreSQL Using Python
CodePudding user response:
I have fixed the problem myself. It has nothing to do with the logic; it's the RAM limit AWS sets for Lambda functions: https://aws.amazon.com/lambda/pricing/?icmpid=docs_console_unmapped. The report line above shows Max Memory Used: 128 MB against a Memory Size of 128 MB, so the function simply ran out of memory reading the zip. Increase the memory from the default 128 MB to whatever you need.
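The setting lives under Configuration → General configuration in the Lambda console, and it can also be changed programmatically. A minimal sketch using boto3 (the function name below is a hypothetical placeholder, and 512 MB is an arbitrary choice):

import boto3

# Raise the function's memory allocation above the 128 MB default.
# 'my-shapefile-loader' is a placeholder for your actual function name.
lambda_client = boto3.client('lambda')
lambda_client.update_function_configuration(
    FunctionName='my-shapefile-loader',
    MemorySize=512,  # in MB; Lambda allocates CPU proportionally to memory
)

If raising the limit isn't an option, another way to dodge the error is to avoid holding the whole zip in memory at all: stream it to Lambda's /tmp scratch space and let geopandas read it from disk. A sketch reusing the bucket and key from the question, assuming the zip fits in /tmp (512 MB by default):

import boto3
import geopandas as gpd

s3 = boto3.resource('s3')
# download_file writes to disk in chunks instead of read()-ing into memory.
s3.Bucket('myshapeawsbucket').download_file('Roads_shp.zip', '/tmp/Roads_shp.zip')
# geopandas (via fiona) can open a zipped shapefile directly.
gdf = gpd.read_file('zip:///tmp/Roads_shp.zip')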