This is my complete newb intro to Boto and AWS. At the moment my only goal is to be able to access an external agency's S3 bucket, so I want to understand how to use them in particular. This minimal code does what I want it to do, but I haven't figured out how to do it by declaring only s3r as a resource, avoiding the need to also use s3 as a client. It seems like it would be better to access the bucket from the S3 resource and then work exclusively with the bucket, i.e., bucket.new_key('testdir/') or bucket.put_object(Key='testdir/'). Is this possible, or is there, alternatively, a good reason to reframe how I'm approaching this? Thanks!
import boto3
bucket_name = 'my-bucket-name'
region_name = 'my-region-name'
print('Acquiring s3 service')
s3 = boto3.client('s3', region_name=region_name)
s3r = boto3.resource('s3', region_name=region_name)
print('Accessing bucket')
bucket = s3r.Bucket(bucket_name)
print('Emptying bucket')
bucket.objects.all().delete()
print('Uploading folder structures')
s3.put_object(Bucket=bucket_name, Key='testdir/')
s3.put_object(Bucket=bucket_name, Key='testdir/subdir1/')
s3.put_object(Bucket=bucket_name, Key='testdir/subdir2/')
CodePudding user response:
The boto3 API provides both a 'client' and 'resource' object model for most of the AWS APIs. The documentation has this to say on the difference:
Resources represent an object-oriented interface to Amazon Web Services (AWS). They provide a higher-level abstraction than the raw, low-level calls made by service clients.
In other words, the 'client' APIs are a fairly one-to-one wrapper over the underlying AWS REST calls, while the 'resource' APIs are meant to be easier to use and provide some quality-of-life improvements that make writing code quicker. Which one to use largely comes down to coding style preference: for the most part, what you can accomplish with 'client' calls can also be accomplished with 'resource' calls, though not always. Certainly, for your example, it's possible either way:
s3 = boto3.client('s3')

# List all of the objects in the bucket. Note that since we're fairly
# close to the underlying REST API with the client interface, we need
# to worry about paginating the list of objects ourselves.
paginator = s3.get_paginator('list_objects_v2')
for page in paginator.paginate(Bucket=bucket_name):
    for cur in page.get('Contents', []):
        # ... and delete each object in turn
        s3.delete_object(Bucket=bucket_name, Key=cur['Key'])

# Create a zero-byte object to represent the folder
s3.put_object(Bucket=bucket_name, Key='testdir/')
The same work can be accomplished with the resource interface:
s3r = boto3.resource('s3')
# Same idea with resource
bucket = s3r.Bucket(bucket_name)
# Paginating, and calling delete on each object in turn is handled
# behind the scenes by all() and delete() in turn
bucket.objects.all().delete()
# Creating the object, again make a zero-byte object to mimic creating
# a folder as the S3 Web UI does
bucket.put_object(Key='testdir/')
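So, to answer the question directly: yes, you can declare only the resource and work exclusively through the bucket object. As a rough sketch, your original script rewritten that way (reusing the bucket_name and region_name from the question) would look something like this:

import boto3

bucket_name = 'my-bucket-name'
region_name = 'my-region-name'

s3r = boto3.resource('s3', region_name=region_name)
bucket = s3r.Bucket(bucket_name)

# Empty the bucket; pagination and per-object deletes are handled for you
bucket.objects.all().delete()

# Create the zero-byte "folder" placeholder objects
bucket.put_object(Key='testdir/')
bucket.put_object(Key='testdir/subdir1/')
bucket.put_object(Key='testdir/subdir2/')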
Again, it comes down to personal preference. I personally prefer using the client interface, since it makes it easier to understand and track which underlying API calls are being made, but it's really up to you.
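One last note: a resource also exposes its underlying low-level client via .meta.client, so if you do start with only the resource and later need a client-style call, you don't have to construct a second object. A small sketch, reusing s3r and bucket_name from above:

# The resource carries its low-level client, so an occasional
# client-style call doesn't require a separate boto3.client('s3')
s3r.meta.client.put_object(Bucket=bucket_name, Key='testdir/')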