I'm using boto3 to access Google Cloud Storage through its S3-compatible API.
Most operations work well, but every batch action fails with this exception:
botocore.exceptions.ClientError: An error occurred (InvalidArgument) when calling the DeleteObjects operation: Invalid argument.
Below is source code that works on AWS but not on GCS.
I can perform actions on GCS objects one by one, but batch actions don't work.
import boto3

BUCKET_NAME = "some-bucket"


def play_with_boto(s3_client):
    bucket = s3_client.Bucket(BUCKET_NAME)

    # upload file
    with open("holbein.jpg", "rb") as f1, open("bruegel.jpeg", "rb") as f2:
        bucket.put_object(Body=f1, Key="img/holbein.jpg")
        bucket.put_object(Body=f2, Key="img/bruegel.jpg")

    for obj in bucket.objects.filter(Prefix="img"):
        print(obj)
        # obj.delete()  # non-optimal workaround for `GCS`

    # this code breaks for `GCS`
    bucket.objects.filter(Prefix="img").delete()

    # even without filter it doesn't work
    # bucket.objects.delete()


if __name__ == "__main__":
    aws_s3 = boto3.resource(
        service_name='s3',
        aws_access_key_id="...",
        aws_secret_access_key="...",
    )
    gcs_s3 = boto3.resource(
        service_name='s3',
        aws_access_key_id="...",
        aws_secret_access_key="...",
        endpoint_url="https://storage.googleapis.com/",
    )

    print("AWS")
    play_with_boto(s3_client=aws_s3)
    print("GCS")
    play_with_boto(s3_client=gcs_s3)
And this is the output:
AWS
s3.ObjectSummary(bucket_name='some-bucket', key='img/bruegel.jpg')
s3.ObjectSummary(bucket_name='some-bucket', key='img/holbein.jpg')
GCS
s3.ObjectSummary(bucket_name='some-bucket', key='img/bruegel.jpg')
s3.ObjectSummary(bucket_name='some-bucket', key='img/holbein.jpg')
Traceback (most recent call last):
  File "/home/jakub/Projects/neptune-client/alpha_integration_dev/tmp.py", line 39, in <module>
    play_with_boto(s3_client=gcs_s3)
  File "/home/jakub/Projects/neptune-client/alpha_integration_dev/tmp.py", line 19, in play_with_boto
    bucket.objects.filter(Prefix="img").delete()
  File "/home/jakub/.venv/neptune-client/lib/python3.8/site-packages/boto3/resources/collection.py", line 515, in batch_action
    return action(self, *args, **kwargs)
  File "/home/jakub/.venv/neptune-client/lib/python3.8/site-packages/boto3/resources/action.py", line 152, in __call__
    response = getattr(client, operation_name)(*args, **params)
  File "/home/jakub/.venv/neptune-client/lib/python3.8/site-packages/botocore/client.py", line 386, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/jakub/.venv/neptune-client/lib/python3.8/site-packages/botocore/client.py", line 705, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (InvalidArgument) when calling the DeleteObjects operation: Invalid argument.
Process finished with exit code 1
Is there any way to perform batch actions on GCS using boto3?
CodePudding user response:
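The batch delete() call uses S3's multi-object delete API (the DeleteObjects operation named in the error), which Google Cloud Storage does not support. Thus, it is not possible to do it this way on Google Cloud Storage: you will need to issue one delete request per key, for example with delete_key() in the older boto library or obj.delete() in boto3, as in the commented-out workaround in the question.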
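A minimal sketch of the per-key approach in boto3, assuming `bucket` is an `s3.Bucket` resource like the one built in the question. The helper name `delete_objects_one_by_one` is hypothetical and not part of boto3; it only wraps the `obj.delete()` workaround from the question in a reusable function.

```python
def delete_objects_one_by_one(bucket, prefix=""):
    """Delete every object under `prefix`, one request per key.

    `bucket` is assumed to be a boto3 `s3.Bucket` resource (or anything
    exposing the same `objects.filter(Prefix=...)` interface).
    Returns the list of keys that were deleted.
    """
    deleted = []
    # Collect the listing first so deletions do not interleave with paging.
    for obj in list(bucket.objects.filter(Prefix=prefix)):
        obj.delete()  # single-object delete, not the batch DeleteObjects call
        deleted.append(obj.key)
    return deleted


# Hypothetical usage with the resource from the question:
#   bucket = gcs_s3.Bucket(BUCKET_NAME)
#   delete_objects_one_by_one(bucket, prefix="img")
```

This sends one request per object, so it is slower than a true batch delete on large prefixes, but it relies only on the single-object delete that already works in the question.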