AWS S3 Lifecycle Rule to cleanup Athena OutputLocation

Time:10-05

I am trying to set up a lifecycle rule to clean up my bloated Athena OutputLocation folder, and I need some clarification:

  • Snippets of the lifecycle rule as currently set up are below

  • Will this rule only apply to the folder athena-results/ in my bucket?

  • The rule actions are a little unclear to me: which ones should I actually select? I want to delete every file in this location that is older than 1 day, both the existing files going back a few years and new files daily going forward. Is my current selection correct? I assume "Expire current version of objects" takes care of all the historical files. Or do I also need to select the 5th option, "Delete expired delete markers or incomplete multipart uploads"?

  • For further context and what this OutputLocation folder is used for:

    $query = $client->startQueryExecution([ "QueryString" => $sql, "ResultConfiguration" => [ "OutputLocation" => "s3://s3location/athena-results" ] ]);

    $obj = $s3->getObject([ 'Bucket' => 'analytics', 'Key' => 'athena-results/'.$queryId.'.csv' ]);

Current rule


CodePudding user response:

Some of those options (e.g. delete markers) only apply if the bucket has Versioning enabled.

Amazon Athena doesn't do multipart uploads or any storage class transitions, so those options are not needed.

Your options look good -- give it a try! It might take 24-48 hours for objects to start disappearing.
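
If you would rather define the rule in code than in the console, here is a minimal sketch with the AWS SDK for PHP. It assumes the bucket is `analytics` and the prefix is `athena-results/`, as in your `getObject` call. The helper name `buildAthenaCleanupRule` and the rule ID are made up for illustration. The prefix filter is what keeps the rule limited to keys under `athena-results/`.

```php
<?php
// Sketch only: builds a lifecycle rule that expires objects under one prefix.
// buildAthenaCleanupRule() is a hypothetical helper, not part of the AWS SDK.
function buildAthenaCleanupRule(string $prefix, int $days): array
{
    return [
        'ID'     => 'expire-athena-results', // arbitrary label (assumption)
        'Status' => 'Enabled',
        // The filter limits the rule to keys that start with this prefix.
        'Filter' => ['Prefix' => $prefix],
        // Expire current object versions $days days after they were created.
        'Expiration' => ['Days' => $days],
    ];
}

$rule = buildAthenaCleanupRule('athena-results/', 1);

// Applying the rule needs an Aws\S3\S3Client in $s3, like the one in the question.
// Be aware that this call replaces the bucket's entire lifecycle configuration.
if (isset($s3)) {
    $s3->putBucketLifecycleConfiguration([
        'Bucket' => 'analytics',
        'LifecycleConfiguration' => ['Rules' => [$rule]],
    ]);
}
```

Because expiration is based on each object's age, the same rule should cover both the old files and anything written from now on.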

Let us know how it went for you!
