Download multiple files from S3 bucket using boto3


I have a CSV file containing numerous UUIDs.

I'd like to write a Python script using boto3 that:

  1. Connects to an AWS S3 bucket
  2. Uses each UUID from the CSV to locate and copy the corresponding file

Files all live under a path like this: BUCKET/ORG/FOLDER1/UUID/DATA/FILE.PNG. However, the file contained in DATA/ can be of different file types.

  3. Puts the copied file in a new S3 bucket

So far, I have successfully connected to the S3 bucket and listed its contents in Python using boto3, but I need help implementing the rest:

import boto3

# Create a session with explicit credentials
session = boto3.Session(
    aws_access_key_id='ACCESS_KEY_ID',
    aws_secret_access_key='SECRET_ACCESS_KEY',
)

# Initiate the S3 resource
s3 = session.resource('s3')

your_bucket = s3.Bucket('BUCKET-NAME')

# Print the contents of the bucket
for s3_file in your_bucket.objects.all():
    print(s3_file.key)

CodePudding user response:

To read the CSV file you can use the csv library (see: https://docs.python.org/fr/3.6/library/csv.html). Example:

import csv

# Read the CSV row by row
with open('file.csv', newline='') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)
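
For instance, if each row holds a single UUID in its first column (an assumption about your CSV layout), you could collect them into a list first:

import csv

uuids = []
with open('file.csv', newline='') as f:
    for row in csv.reader(f):
        if row:                   # skip blank lines
            uuids.append(row[0])  # assumes the UUID is in the first column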

To push files to the new bucket, you can use the copy method (see: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.copy). Note that the second argument of copy is the destination object key, not the bucket name.

Example:

import boto3

s3 = boto3.resource('s3')
source = {
    'Bucket': 'BUCKET-NAME',
    'Key': 'mykey'
}
bucket = s3.Bucket('SECOND_BUCKET-NAME')
# The second argument is the destination key, not a bucket name
bucket.copy(source, 'mykey')
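
Putting it together: since the object under DATA/ can have any extension, one approach is to list the objects under each UUID's prefix with objects.filter and copy whatever is found. Here is a minimal sketch, assuming the ORG/FOLDER1 part of the path is fixed, that each CSV row holds one UUID in its first column, and that the keys should be kept unchanged in the destination bucket:

import csv
import boto3

session = boto3.Session(
    aws_access_key_id='ACCESS_KEY_ID',
    aws_secret_access_key='SECRET_ACCESS_KEY',
)
s3 = session.resource('s3')

source_bucket = s3.Bucket('BUCKET-NAME')
dest_bucket = s3.Bucket('SECOND_BUCKET-NAME')

with open('file.csv', newline='') as f:
    for row in csv.reader(f):
        uuid = row[0]  # assumes one UUID per row, in the first column
        prefix = f'ORG/FOLDER1/{uuid}/DATA/'  # adjust to your actual path layout
        # List every object under this UUID's DATA/ prefix, whatever its file type
        for obj in source_bucket.objects.filter(Prefix=prefix):
            # Copy each object into the destination bucket, keeping its key
            dest_bucket.copy({'Bucket': source_bucket.name, 'Key': obj.key}, obj.key)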