I would like to download several files from an AWS S3 bucket. They all have the same name but are in different subfolders, and no credentials are required to connect to the bucket or download from it. I want to download every file called "B01.tif" under s3://sentinel-cogs/sentinel-s2-l2a-cogs/7/V/EG/ and save each one under the name of the subfolder it is in (for example: S2A_7VEG_20170205_0_L2AB01.tif).
Path example:
s3://sentinel-cogs/sentinel-s2-l2a-cogs/7/V/EG/2017/2/S2A_7VEG_20170205_0_L2A/B01.tif
I was thinking of using a bash script that takes the output of ls, downloads each file with cp, and saves it on my PC under a name generated from the path.
Command to use ls:
aws s3 ls s3://sentinel-cogs/sentinel-s2-l2a-cogs/7/V/EG/2017/2/ --no-sign-request
Command to download a single file:
aws s3 cp s3://sentinel-cogs/sentinel-s2-l2a-cogs/7/V/EG/2017/2/S2A_7VEG_20170205_0_L2A/B01.tif --no-sign-request B01.tif
Attempt to download multiple files:
VAR1=B01.tif
for a in s3://sentinel-cogs/sentinel-s2-l2a-cogs/7/V/EG/:
    for b in s3://sentinel-cogs/sentinel-s2-l2a-cogs/7/V/EG/2017/:
        for c in s3://sentinel-cogs/sentinel-s2-l2a-cogs/7/V/EG/2017/2/:
            NAME=$(aws s3 ls s3://sentinel-cogs/sentinel-s2-l2a-cogs/7/V/EG/$a$b$c | head -1)
            aws s3 cp s3://sentinel-cogs/sentinel-s2-l2a-cogs/7/V/EG/$NAME/B01.tif --no-sign-request $NAME$VAR1
        done
    done
done
I don't know if there is a simple way to go automatically through every subfolder and save the files directly. I know my ls command is broken, because if there are multiple subfolders it will only take the first one as a variable.
Answer:
It's easier to do this in a programming language than in a shell script.
Here's a Python script that will do it for you:
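import boto3
from botocore import UNSIGNED
from botocore.config import Config

BUCKET = 'sentinel-cogs'
PREFIX = 'sentinel-s2-l2a-cogs/7/V/EG/'
FILE = 'B01.tif'

# The bucket is public, so send unsigned requests (like --no-sign-request in the CLI)
s3_resource = boto3.resource('s3', config=Config(signature_version=UNSIGNED))

for obj in s3_resource.Bucket(BUCKET).objects.filter(Prefix=PREFIX):
    if obj.key.endswith(FILE):
        # 2017/2/S2A_7VEG_20170205_0_L2A/B01.tif -> 2017_2_S2A_7VEG_20170205_0_L2A_B01.tif
        target = obj.key[len(PREFIX):].replace('/', '_')
        obj.Object().download_file(target)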
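The script above names each file after its whole path below the prefix (year, month, and subfolder). If you would rather use only the immediate subfolder, as the question asks, the target name can be built from the last two parts of the key. This is a minimal sketch: `target_name` and its `sep` parameter are hypothetical names introduced here for illustration, not part of boto3.

```python
def target_name(key, sep="_"):
    """Build a local filename from an S3 key: parent folder name + sep + file name.

    `target_name` and `sep` are illustrative names, not part of any library.
    """
    parts = key.split("/")
    if len(parts) < 2:
        # The key has no parent folder, so keep the file name unchanged
        return parts[-1]
    return parts[-2] + sep + parts[-1]


# Example using the path from the question
example_key = "sentinel-s2-l2a-cogs/7/V/EG/2017/2/S2A_7VEG_20170205_0_L2A/B01.tif"
print(target_name(example_key))          # S2A_7VEG_20170205_0_L2A_B01.tif
print(target_name(example_key, sep=""))  # S2A_7VEG_20170205_0_L2AB01.tif
```

To use it, replace the line that computes target in the loop with a call to target_name on the object's key. Passing sep="" reproduces the exact name format given in the question.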