Batch action with boto3 on GCS service

Time:10-29

I'm using boto3 to access Google Cloud Storage through its S3-compatible API.

Most operations work well, but I can't perform any batch actions because of this exception:

botocore.exceptions.ClientError: An error occurred (InvalidArgument) when calling the DeleteObjects operation: Invalid argument.


Below is source code that works on AWS but not on GCS. I can act on GCS objects one by one, but the batch call fails.

import boto3

BUCKET_NAME = "some-bucket"


def play_with_boto(s3_client):
    bucket = s3_client.Bucket(BUCKET_NAME)

    # upload files
    with open("holbein.jpg", "rb") as f1, open("bruegel.jpeg", "rb") as f2:
        bucket.put_object(Body=f1, Key="img/holbein.jpg")
        bucket.put_object(Body=f2, Key="img/bruegel.jpg")

    for obj in bucket.objects.filter(Prefix="img"):
        print(obj)
        # obj.delete() # non-optimal workaround for `GCS`

    # this code breaks for `GCS`
    bucket.objects.filter(Prefix="img").delete()
    # even without filter it doesn't work
    # bucket.objects.delete()


if __name__ == "__main__":
    aws_s3 = boto3.resource(
        service_name='s3',
        aws_access_key_id="...",
        aws_secret_access_key="...",
    )
    gcs_s3 = boto3.resource(
        service_name='s3',
        aws_access_key_id="...",
        aws_secret_access_key="...",
        endpoint_url="https://storage.googleapis.com/",
    )
    print("AWS")
    play_with_boto(s3_client=aws_s3)
    print("GCS")
    play_with_boto(s3_client=gcs_s3)

And this is the output:

AWS
s3.ObjectSummary(bucket_name='some-bucket', key='img/bruegel.jpg')
s3.ObjectSummary(bucket_name='some-bucket', key='img/holbein.jpg')
GCS
s3.ObjectSummary(bucket_name='some-bucket', key='img/bruegel.jpg')
s3.ObjectSummary(bucket_name='some-bucket', key='img/holbein.jpg')
Traceback (most recent call last):
  File "/home/jakub/Projects/neptune-client/alpha_integration_dev/tmp.py", line 39, in <module>
    play_with_boto(s3_client=gcs_s3)
  File "/home/jakub/Projects/neptune-client/alpha_integration_dev/tmp.py", line 19, in play_with_boto
    bucket.objects.filter(Prefix="img").delete()
  File "/home/jakub/.venv/neptune-client/lib/python3.8/site-packages/boto3/resources/collection.py", line 515, in batch_action
    return action(self, *args, **kwargs)
  File "/home/jakub/.venv/neptune-client/lib/python3.8/site-packages/boto3/resources/action.py", line 152, in __call__
    response = getattr(client, operation_name)(*args, **params)
  File "/home/jakub/.venv/neptune-client/lib/python3.8/site-packages/botocore/client.py", line 386, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/jakub/.venv/neptune-client/lib/python3.8/site-packages/botocore/client.py", line 705, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (InvalidArgument) when calling the DeleteObjects operation: Invalid argument.

Process finished with exit code 1

Is there any way to perform batch actions on GCS using boto3?

Answer:

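The batch delete() on a collection uses S3's multi-object delete API (the DeleteObjects operation named in the error), which Google Cloud Storage does not support. Thus, it is not possible to do it this way for Google Cloud Storage. You will need to issue one delete per key: obj.delete() on each object in boto3, as in the commented-out workaround in the question (delete_key() is the equivalent call in the older boto library).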
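
A minimal sketch of the per-object workaround, assuming bucket is a boto3 Bucket resource such as gcs_s3.Bucket(BUCKET_NAME) from the question. The helper name delete_one_by_one is hypothetical and not part of boto3; it only relies on the collection's filter() and each object summary's delete() and key.

```python
def delete_one_by_one(bucket, prefix=""):
    """Delete every object under `prefix`, one request per key.

    Hypothetical helper (not a boto3 API). `bucket` is assumed to be a
    boto3 `s3.Bucket` resource; this avoids the batch DeleteObjects call
    that fails against GCS in the question.
    """
    # Materialize the listing first so deletion does not run while
    # the collection is still being paginated.
    objects = list(bucket.objects.filter(Prefix=prefix))

    deleted_keys = []
    for obj in objects:
        obj.delete()  # single-object delete, one request per key
        deleted_keys.append(obj.key)
    return deleted_keys


# Usage, with the resource from the question (assumed names):
#   bucket = gcs_s3.Bucket(BUCKET_NAME)
#   delete_one_by_one(bucket, prefix="img")
```

This costs one request per object, so it is slower than a real batch delete on large prefixes, but it uses only calls that the question already shows working against GCS.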
