Indicate a directory on Amazon's S3


I'm new to AWS services. I've always used the code below to calculate NDVI for images located in a local directory.

import os
import rasterio

path = r'images'
dirContents = os.listdir(path)

for file in dirContents:
    subPath = os.path.join(path, file)
    if os.path.isdir(subPath):
        subDir = os.listdir(subPath)

        # Assuming only two files in each subdirectory, bands 4 and 8 here
        if "B04" in subDir[0]:
            band4 = rasterio.open(os.path.join(subPath, subDir[0]))
            band8 = rasterio.open(os.path.join(subPath, subDir[1]))
        else:
            band4 = rasterio.open(os.path.join(subPath, subDir[1]))
            band8 = rasterio.open(os.path.join(subPath, subDir[0]))

        red = band4.read(1).astype('float32')
        nir = band8.read(1).astype('float32')

        # Compute the NDVI
        ndvi = (nir - red) / (nir + red)

        profile = band4.meta
        profile.update(driver='GTiff')
        profile.update(dtype=rasterio.float32)

        outfile = os.path.join(subPath, 'ndvi.tif')  # one output per subdirectory
        with rasterio.open(outfile, 'w', **profile) as dst:
            dst.write(ndvi.astype(rasterio.float32), 1)

Now all the necessary images are in an Amazon S3 bucket. How do I replace the lines below?

path = r'images'
dirContents = os.listdir(path)

CodePudding user response:

Amazon S3 is not a filesystem. You will need to use different commands to:

  • List the contents of a bucket/path
  • Download the files to local storage
  • Then access the files on local storage

You can use the boto3 AWS SDK for Python to access objects stored in S3.

For example:

import boto3

s3_resource = boto3.resource('s3')

# List objects under the images/ prefix
objects = s3_resource.Bucket('your-bucket').objects.filter(Prefix='images/')

# Loop through each object and download it
for obj in objects:
    local_name = obj.key.split('/')[-1]  # keep the original file name
    s3_resource.Object(obj.bucket_name, obj.key).download_file(local_name)
    # Do something with the file here
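To keep the original per-subdirectory loop working unchanged, you can mirror the bucket's pseudo-folder layout on local disk. The helper below is a sketch; the `images/` prefix and the bucket name in the commented usage are placeholder assumptions:

```python
import os

def key_to_local_path(key, prefix='images/', dest='images'):
    """Map an S3 key such as 'images/scene1/B04.tif' to a local
    path like 'images/scene1/B04.tif', preserving the layout."""
    rel = key[len(prefix):] if key.startswith(prefix) else key
    return os.path.join(dest, *rel.split('/'))

# With boto3 (bucket name is a placeholder) you would then do:
# for obj in s3_resource.Bucket('your-bucket').objects.filter(Prefix='images/'):
#     local = key_to_local_path(obj.key)
#     os.makedirs(os.path.dirname(local), exist_ok=True)
#     s3_resource.Object(obj.bucket_name, obj.key).download_file(local)
```

Note that rasterio can also open objects in place with URLs of the form `s3://bucket/key` (via GDAL's `/vsis3/` handler), which avoids downloading files at all, provided your AWS credentials are configured.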

CodePudding user response:

If you are new to AWS, you may also consider the Apache Libcloud library. It lets you use different cloud providers through a unified API. For a storage solution you could do the following (code adapted from the Libcloud documentation):

from libcloud.storage.types import Provider
from libcloud.storage.providers import get_driver

cls = get_driver(Provider.S3)
s3 = cls(aws_id, aws_secret)

container = s3.get_container(container_name='name')
objects = s3.list_container_objects(container, prefix='path')

# Download a file
s3.download_object(objects[0], '/path/to/download')

Some things to note:

  • Files are stored in an S3 bucket (container). Although buckets have a flat hierarchy, you are allowed to use key names like 'path/subpath/file1' to organize files in folders.
  • You need to authenticate access to the bucket. In the above code you do so by providing an id and a secret.
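Putting this together with the NDVI script above: one approach (a sketch, assuming a hypothetical key layout of `images/<scene>/<band>.tif`) is to group the listed keys by scene before downloading, so the band-4/band-8 pairing logic carries over directly:

```python
from collections import defaultdict

def group_bands_by_scene(keys):
    """Group keys like 'images/<scene>/<file>.tif' by scene,
    picking out the B04 (red) and B08 (NIR) files."""
    scenes = defaultdict(dict)
    for key in keys:
        parts = key.split('/')
        if len(parts) < 3:
            continue  # not inside a scene "folder"
        scene, filename = parts[-2], parts[-1]
        if 'B04' in filename:
            scenes[scene]['B04'] = key
        elif 'B08' in filename:
            scenes[scene]['B08'] = key
    return dict(scenes)

keys = ['images/scene1/T31_B04.tif', 'images/scene1/T31_B08.tif',
        'images/scene2/T32_B04.tif', 'images/scene2/T32_B08.tif']
print(group_bands_by_scene(keys)['scene1']['B04'])  # images/scene1/T31_B04.tif
```

Each scene's pair of keys can then be downloaded (with boto3 or Libcloud) and fed to `rasterio.open` exactly as in the local-directory version.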