In an AWS Glue job, I'm using ftplib to download files and store them in S3, with the following code:
from ftplib import FTP
ftp = FTP()
ftp.connect("ftp.ser.ver", 21)
ftp.login("user", "password")
remotefile='filename.txt'
download='s3://bucket/folder/filename.txt'
with open(download,'wb') as file:
    ftp.retrbinary('RETR %s' % remotefile, file.write)
And I got the following error:
FileNotFoundError: [Errno 2] No such file or directory
I ran the same code locally with the download path changed to a local path, and it works. I'm fairly new to S3 and Glue and not sure where to look for the right documentation. Any insight or suggestion is greatly appreciated.
CodePudding user response:
You can't download an FTP file and save it directly to S3: open() expects a local filesystem path, so passing an s3:// URL raises FileNotFoundError. You have to stage the file in the Glue environment first, using either a file-based or a memory-based stream, and then upload it to S3. For example, with a temporary file:
import boto3
from ftplib import FTP

ftp = FTP()
ftp.connect("ftp.ser.ver", 21)
ftp.login("user", "password")

# Download the FTP file to a local temp file in the Glue environment
with open("/tmp/filename.txt", 'wb') as file:
    ftp.retrbinary("RETR filename.txt", file.write)

# Upload the temp file to S3
s3 = boto3.client('s3')
with open("/tmp/filename.txt", "rb") as f:
    s3.upload_fileobj(f, "BUCKET_NAME", "OBJECT_NAME")
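If you'd rather not touch the local disk, the memory-based variant works the same way. This is just a sketch of that alternative using io.BytesIO with the same boto3 client; "BUCKET_NAME" and "OBJECT_NAME" are placeholders, and the FTP host and credentials are the ones from your question:

import io

import boto3
from ftplib import FTP

ftp = FTP()
ftp.connect("ftp.ser.ver", 21)
ftp.login("user", "password")

# Stream the FTP file into an in-memory buffer instead of a temp file
buffer = io.BytesIO()
ftp.retrbinary("RETR filename.txt", buffer.write)
ftp.quit()

# Rewind the buffer, then upload its contents to S3
buffer.seek(0)
s3 = boto3.client('s3')
s3.upload_fileobj(buffer, "BUCKET_NAME", "OBJECT_NAME")

Keep in mind this holds the whole file in memory, so the /tmp approach above is safer for large files given the Glue worker's memory limits.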