I am downloading files from an S3 bucket and I want to place them in their respective folders.
with open(wrkbk, 'r') as csvfile:
    datareader = csv.reader(csvfile)
    for row in datareader:
        stage, path = row[0], row[1]
        fpath = os.path.join("C://Users//local1//Project101//Images//", stage)
        your_bucket.download_file(path, fpath + path)
What I want is for the images to go to:
Images -> Stage-1 -> image1.jpg
          Stage-2 -> image2.jpg
          ...
But what is happening is this:
Images -> Stage-1image1.jpg, Stage-2image2.jpg
I have already created all the stage folders inside the Images folder, so I didn't try os.makedirs().
The files are not going into the separate folders. I tried doing fpath/path, but it says a string can't be divided.
Instead of landing inside the stage folder, each file ends up with the folder name stuck to the front of its file name.
CodePudding user response:
You'll need to use os.path.join() for the last segment too, so there's a slash in between.
I also added os.path.basename(), just in case the CSV file's path also has a directory part...
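import csv
import os

base = "C:/Users/local1/Project101/Images"
with open(wrkbk) as csvfile:
    datareader = csv.reader(csvfile)
    for stage, path in datareader:
        # Keep only the file name in case the key has a directory part.
        filename = os.path.basename(path)
        # Join all three segments so a separator is inserted between each.
        full_path = os.path.join(base, stage, filename)
        your_bucket.download_file(path, full_path)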
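As a side note on the fpath/path attempt from the question: the / operator is not defined for plain strings, but pathlib path objects do support it. The sketch below first reproduces the "sticking" result of string concatenation, then builds the same target with pathlib. The helper name local_target and the key "uploads/image1.jpg" are made up for illustration, and PureWindowsPath is used only so the result is identical on any operating system; in a real script pathlib.Path would be the usual choice.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Plain string concatenation adds no separator, which is the behaviour seen in the question.
stage_dir = "C:/Users/local1/Project101/Images/Stage-1"
glued = stage_dir + "image1.jpg"
print(glued)  # C:/Users/local1/Project101/Images/Stage-1image1.jpg

# Path objects support "/", unlike str.
base = PureWindowsPath("C:/Users/local1/Project101/Images")


def local_target(stage, key):
    """Hypothetical helper: build <base>/<stage>/<file name of key>."""
    filename = PurePosixPath(key).name  # drop any directory part of the S3 key
    return base / stage / filename


target = local_target("Stage-1", "uploads/image1.jpg")  # hypothetical key
print(target.as_posix())  # C:/Users/local1/Project101/Images/Stage-1/image1.jpg
```

When passing such a path on to download_file, converting it with str(target) keeps the second argument a plain string.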
CodePudding user response:
The function below downloads all files under a bucket path to a local directory. A key is treated as a file when it does not end with "/". Any missing subdirectories are created recursively under the local root directory you download to.
import os

import boto3

s3 = boto3.resource('s3')


def download_s3_folder(bucket_name, s3_folder, local_dir=None):
    """Download every file under s3_folder into local_dir (default: current directory)."""
    if local_dir is None:
        local_dir = "."
    bucket = s3.Bucket(bucket_name)
    for obj in bucket.objects.filter(Prefix=s3_folder):
        # Keys ending in "/" are folder placeholders, not files.
        if obj.key.endswith('/'):
            continue
        target = os.path.join(local_dir, os.path.relpath(obj.key, s3_folder))
        # Create the local parent directories if they don't exist yet.
        os.makedirs(os.path.dirname(target), exist_ok=True)
        print(f"downloading {obj.key} to {target}")
        bucket.download_file(obj.key, target)
Usage example
download_s3_folder("MY_BUCKET", "PATH/Images/", "./Images")
This downloads all content under the path PATH/Images/ in the MY_BUCKET bucket into the directory ./Images, relative to the working directory you run the script from, and keeps the file/path structure.
To download the entire bucket content while keeping the file/path structure, use:
download_s3_folder("MY_BUCKET", "", "./my-bucket-content")
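To check where files would land before downloading anything, the key-to-path mapping can be computed on its own. This is a minimal sketch, not part of the answer above: plan_targets is a hypothetical helper, the keys are made-up examples, and it assumes the same rule as the function (keys ending in "/" are skipped, and the part of the key after the prefix is kept as the local sub-path).

```python
import os
import posixpath


def plan_targets(keys, s3_folder, local_dir):
    """Map S3 keys to local file paths without downloading anything (hypothetical helper)."""
    plan = []
    for key in keys:
        # Keys ending in "/" are folder placeholders, not files.
        if key.endswith("/"):
            continue
        # S3 keys always use "/" as the separator, so compute the relative part with posixpath.
        relative = posixpath.relpath(key, s3_folder)
        # Re-join the segments with the local OS separator.
        plan.append((key, os.path.join(local_dir, *relative.split("/"))))
    return plan


# Made-up keys for illustration.
keys = [
    "PATH/Images/",
    "PATH/Images/Stage-1/image1.jpg",
    "PATH/Images/Stage-2/image2.jpg",
]
for key, target in plan_targets(keys, "PATH/Images/", "Images"):
    print(key, "->", target)
```

Because the helper never touches the network, it can be run against a list of keys copied from the bucket listing to confirm the folder layout first.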