I have a folder with 1.5 million objects (about 5 TB of data). It contains subfolders named in the format 123-John. I need to copy the contents of all these folders into new folders named in the format 123. I want to do this in Java.
Obviously I can't just do it one by one, like this:
ObjectListing objectListing = s3.listObjects(listObjectsRequest);
boolean processable = true;
while (processable) {
    processable = objectListing.isTruncated();
    renameAndCopyOneByOne(objectListing.getObjectSummaries()); // this edits name and makes call to s3.copyObject()
    if (processable) {
        objectListing = s3.listNextBatchOfObjects(objectListing);
    }
}
That would result in about 1.5 million calls to
s3.copyObject(bucket, sourceKey, bucket, destinationKey)
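The key rewrite itself is the easy part. A minimal sketch, assuming every key contains a folder segment of the form <digits>-<name>/ (KeyRenamer and toDestinationKey are hypothetical names used only for illustration, not part of the AWS SDK):

```java
import java.util.regex.Pattern;

class KeyRenamer {
    // Matches a folder segment such as "123-John/" at the start of the key
    // or after a "/", and captures the numeric id.
    // Assumption: folder names look like "<digits>-<name>".
    private static final Pattern FOLDER = Pattern.compile("(^|/)(\\d+)-[^/]+/");

    // Returns the destination key with the first matching folder segment
    // reduced to its numeric id. Keys without such a segment are returned unchanged.
    public static String toDestinationKey(String sourceKey) {
        return FOLDER.matcher(sourceKey).replaceFirst("$1$2/");
    }
}
```

With this, a key like 123-John/photo.jpg would map to 123/photo.jpg, and the result would be passed as destinationKey to the copy call.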
I wanted to do it with a batch job (S3 Batch Operations), but that can only be done by creating a manifest file in CSV format, with lines like
bucketName,keyName
But this manifest only lists the objects I want to act on. I can't specify where each object should be saved or give the edited folder name. I would also still have to split a CSV of 1.5 million entries into smaller files and send several requests to S3 to create several jobs, which would be hard to track.
Could you please give me a hint as to which AWS tool would best fit this task?
CodePudding user response:
Well, after spending some time on how to do this properly, I think the only way is to run the migration as a batch job from Java and split the load, because AWS does not have a suitable tool for my case.
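A minimal sketch of what splitting the load could look like, assuming each listing page is handed to a fixed thread pool. ParallelCopier, copyPage, and the rename and copy hooks are hypothetical names; in real code the copy hook would wrap s3.copyObject(bucket, sourceKey, bucket, destinationKey), and the pool size and timeout are placeholders to tune:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.BiConsumer;
import java.util.function.UnaryOperator;

class ParallelCopier {
    // Copies all keys of one listing page concurrently and returns the
    // source keys whose copy failed, so they can be retried later.
    // rename: source key -> destination key (hypothetical hook)
    // copy:   (sourceKey, destinationKey) -> performs the copy (hypothetical hook)
    public static List<String> copyPage(List<String> sourceKeys,
                                        UnaryOperator<String> rename,
                                        BiConsumer<String, String> copy,
                                        int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Queue<String> failed = new ConcurrentLinkedQueue<>();
        for (String sourceKey : sourceKeys) {
            pool.submit(() -> {
                try {
                    copy.accept(sourceKey, rename.apply(sourceKey));
                } catch (RuntimeException e) {
                    failed.add(sourceKey); // remember the key for a retry pass
                }
            });
        }
        pool.shutdown(); // no new tasks; already submitted ones keep running
        try {
            if (!pool.awaitTermination(1, TimeUnit.HOURS)) {
                pool.shutdownNow();
                throw new IllegalStateException("copy tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for copies", e);
        }
        return new ArrayList<>(failed);
    }
}
```

This still makes one copy request per object, but the requests run in parallel instead of one by one. The sketch creates a pool per page to stay short; a long-running job would more likely share one pool across all pages and log the failed keys for a second pass.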