Upload multiple files to multiple S3 buckets in Terraform


I am very new to Terraform. My requirement is to upload objects to existing S3 buckets. I want to upload one or more objects from my source to one or more buckets using only a single resource. With count and count.index I can create varying numbers of resources, but doing so prevents me from using fileset, which recursively collects all the files in a folder. The basic code looks like this; it uploads multiple files to a single bucket, but I would like to modify it to upload to multiple buckets:

variable "source_file_path" {
  type        = list(string)
  description = "Path from where objects are to be uploaded"
}

variable "bucket_name" {
  type        = list(string)
  description = "Name or ARN of the bucket to put the file in"
}

variable "data_folder" {
  type        = list(string)
  description = "Object path inside the bucket"
}

resource "aws_s3_bucket_object" "upload_object" {
  for_each = fileset(var.source_file_path, "*")
  bucket   = var.bucket_name
  key      = "${var.data_folder}${each.value}"
  source   = "${var.source_file_path}${each.value}"
}

I have created a vars.tfvars file with the following values:

source_file_path = ["source1","source2"]
bucket_name = ["bucket1","bucket2"]
data_folder = ["path1","path2"]

So, what I need is, terraform to be able to upload all the files from the source1 to bucket1 s3 bucket by creating path1 inside the bucket. And similarly for source2, bucket2, and path2.

Is this something that can be done in terraform?

CodePudding user response:

From your problem description it sounds like a more intuitive data structure to describe what you want to create would be a map of objects where the keys are bucket names and the values describe the settings for that bucket:

variable "buckets" {
  type = map(object({
    source_file_path = string
    key_prefix       = string
  }))
}

When defining the buckets in your .tfvars file this will now appear as a single definition with a complex type:

buckets = {
  bucket1 = {
    source_file_path = "source1"
    key_prefix       = "path1"
  }
  bucket2 = {
    source_file_path = "source2"
    key_prefix       = "path2"
  }
}

This data structure has one element for each bucket, so it is suitable to use directly as the for_each for a resource describing the buckets:

resource "aws_s3_bucket" "example" {
  for_each = var.buckets

  bucket = each.key
  # ...
}
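(Since you mentioned the buckets already exist, note that you could instead refer to them with a data source rather than managing them as resources; the rest of the configuration would then use data.aws_s3_bucket.example in place of aws_s3_bucket.example. A minimal sketch:)

```hcl
# Alternative for pre-existing buckets managed outside of Terraform:
# look them up rather than creating them.
data "aws_s3_bucket" "example" {
  for_each = var.buckets

  bucket = each.key
}
```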

There is a pre-existing official module hashicorp/dir/template which already encapsulates the work of finding files under a directory prefix, assigning each one a Content-Type based on its filename suffix, and optionally rendering templates. (You can ignore the template feature if you don't need it, by making your directory only contain static files.)

We need one instance of that module per bucket, because each bucket will have its own directory and thus its own set of files, and so we can use for_each chaining to tell Terraform that each instance of this module is related to one bucket:

module "bucket_files" {
  source = "hashicorp/dir/template"

  for_each = aws_s3_bucket.example

  base_dir = var.buckets[each.key].source_file_path
}

The module documentation shows how to map the result of the module to S3 bucket objects, but that example is for only a single instance of the module. In your case we need an extra step to turn this into a single collection of files across all buckets, which we can do using flatten:

locals {
  bucket_files_flat = flatten([
    for bucket_name, files_module in module.bucket_files : [
      for file_key, file in files_module.files : {
        bucket_name  = bucket_name
        local_key    = file_key
        remote_key   = "${var.buckets[bucket_name].key_prefix}${file_key}"
        source_path  = file.source_path
        content      = file.content
        content_type = file.content_type
        etag         = file.digests.md5
      }
    ]
  ])
}
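For illustration, given the example variable values above and assuming each source directory contains a single static file named example.txt (the filenames, source paths, and digests below are placeholders, not real module output), local.bucket_files_flat would evaluate to something like:

```hcl
# Hypothetical value of local.bucket_files_flat. For static files the
# module sets content to null and source_path to the file on disk.
bucket_files_flat = [
  {
    bucket_name  = "bucket1"
    local_key    = "example.txt"
    remote_key   = "path1example.txt" # prefix concatenated with no separator
    source_path  = "source1/example.txt"
    content      = null
    content_type = "text/plain"
    etag         = "0123456789abcdef0123456789abcdef" # placeholder MD5
  },
  {
    bucket_name  = "bucket2"
    local_key    = "example.txt"
    remote_key   = "path2example.txt"
    source_path  = "source2/example.txt"
    content      = null
    content_type = "text/plain"
    etag         = "fedcba9876543210fedcba9876543210" # placeholder MD5
  },
]
```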

resource "aws_s3_bucket_object" "example" {
  for_each = {
    for bf in local.bucket_files_flat :
    "s3://${bf.bucket_name}/${bf.remote_key}" => bf
  }

  # Now the rest of this is basically the same as
  # the hashicorp/dir/template S3 example, but using
  # the local.bucket_files_flat structure instead
  # of the module result directly.

  bucket       = each.value.bucket_name
  key          = each.value.remote_key
  content_type = each.value.content_type

  # The template_files module guarantees that only one of these two attributes
  # will be set for each file, depending on whether it is an in-memory template
  # rendering result or a static file on disk.
  source  = each.value.source_path
  content = each.value.content

  # Unless the bucket has encryption enabled, the ETag of each object is an
  # MD5 hash of that object.
  etag = each.value.etag
}

Terraform needs a unique tracking key for each instance of aws_s3_bucket_object.example, and so I just arbitrarily decided to use the s3:// URI convention here, since I expect that's familiar to folks accustomed to working with S3. This means that the resource block will declare instances with addresses like this:

  • aws_s3_bucket_object.example["s3://bucket1/path1example.txt"]
  • aws_s3_bucket_object.example["s3://bucket2/path2other_example.txt"]

Because these objects are uniquely identified by their final location in S3, Terraform will treat a change to a file's content as an in-place update, but any change to its location as destroying the existing object and creating a new one at the same time.

(I replicated the fact that your example concatenated the path prefix with the filename without any intermediate separator, which is why it appears as path1example.txt above and not path1/example.txt. If you want the slash in there, you can add it to the expression that defines remote_key inside local.bucket_files_flat.)
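For example, adding the slash is a one-character change to that expression (bucket_name here is the loop symbol already declared in the locals block):

```hcl
remote_key = "${var.buckets[bucket_name].key_prefix}/${file_key}"
```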
