How to create multiple Glue jobs from a single job block in Terraform


I am new to Terraform scripting. I want to create multiple Glue jobs, each with a different name and a different script. Is there any possibility to create these multiple jobs from a single job block with the help of variables?

For example, variable.tf:

variable "db2jobnames" {
  description = "db2 glue job names"
  type        = list
  default     = ["sql_db_job", "sql_db_job2"]
}

variable "script_location" {
  description = "db2 glue job scripts"
  type        = list
  default     = ["s3://s3_buget/sql_db_job.py", "s3://s3_buget/sql_db_job.py"]
}
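
Since the two lists must line up one-to-one, I am also wondering whether they could be paired into a single map. A rough sketch (not tested; as far as I can tell zipmap errors out if the two lists differ in length):

locals {
  # Pair each job name with its script location; zipmap requires both
  # lists to be the same length, which guards against them drifting apart.
  db2_jobs = zipmap(var.db2jobnames, var.script_location)
}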

glue-connection.tf:

resource "aws_glue_connection" "conn_db2" {
  count           = var.created_CR ? 1 : 0
  connection_type = "JDBC"
  connection_properties = {
    JDBC_CONNECTION_URL = "jdbc:db2://lkidjhyft:50000/ZXHAG006G"
    PASSWORD            = "acfg3"
    USERNAME            = "ndhygsf"
  }

  name = "${var.department}-${var.application}-connection"

  physical_connection_requirements {
    availability_zone      = var.connection_availability_zone
    security_group_id_list = data.aws_security_groups.AWS_Public_Services.ids
    subnet_id              = data.aws_subnet.selected.id
  }
}

And my Glue job, main.tf:

resource "aws_glue_job" "etl_jobs" {
  count    = var.created_GL ? 1 : 0
  count    = "${length(var.db2jobnames)}"
  count    = "${length(var.script_location)}"
  name     = "${var.db2jobnames[count.index]}_db2etljobs"
  role_arn = aws_iam_role.glue_role.arn

  command {
    python_version  = var.python_version
    script_location = "${var.script_location[count.index]}"
  }
  default_arguments = {
    "--extra-jars"                       = "${var.JarDir}"
    "--TempDir"                          = "${var.TempDir}"
    "--class"                            = "GlueApp"
    "--enable-continuous-cloudwatch-log" = "${var.enable-continuous-cloudwatch_log}"
    "--enable-glue-datacatalog"          = "${var.enable-glue-datacatalog}"
    "--enable-metrics"                   = "${var.enable-metrics}"
    "--enable-spark-ui"                  = "${var.enable-spark-ui}"
    "--job-bookmark-option"              = "${var.job-bookmark-option}"
    "--job-language"                     = "python"
    "--env"                              = "${var.paramregion}"
    "--spark-event-logs-path"            = "${var.sparkeventlogpath}"
  }
  execution_property {
    max_concurrent_runs = var.max_concurrent_runs
  }
  connections = [
    "${aws_glue_connection.conn_db2[count.index].name}"
  ]
  glue_version      = var.glue_version
  max_retries       = 0
  worker_type       = var.worker_type
  number_of_workers = 20
  timeout           = 2880
  tags              = local.common_tags
}

I have tried to insert multiple count arguments, but I am getting an error. How could we create two jobs from one resource block, so that each job is created with the matching db name and script location, as shown below?

job1--> sql_db_job - s3://s3_buget/sql_db_job.py
job2--> sql_db_job2 - s3://s3_buget/sql_db_job2.py

Any responses would be appreciated. Thank you.

CodePudding user response:

Terraform only allows a single count argument per resource, which is why the block above fails. Based on the variables and code you have provided, you would have to change count so that it uses the length of one of the lists (both lists are the same length, so either works). Also note that only one Glue connection is ever created, so every job should reference index 0. For example:

resource "aws_glue_job" "etl_jobs" {
  count    = var.created_GL ? length(var.db2jobnames) : 0
  name     = "${var.db2jobnames[count.index]}_db2etljobs"
  role_arn = aws_iam_role.glue_role.arn

  command {
    python_version  = var.python_version
    script_location = var.script_location[count.index]
  }
  default_arguments = {
    "--extra-jars"                       = var.JarDir
    "--TempDir"                          = var.TempDir
    "--class"                            = "GlueApp"
    "--enable-continuous-cloudwatch-log" = var.enable-continuous-cloudwatch_log
    "--enable-glue-datacatalog"          = var.enable-glue-datacatalog
    "--enable-metrics"                   = var.enable-metrics
    "--enable-spark-ui"                  = var.enable-spark-ui
    "--job-bookmark-option"              = var.job-bookmark-option
    "--job-language"                     = "python"
    "--env"                              = var.paramregion
    "--spark-event-logs-path"            = var.sparkeventlogpath
  }
  execution_property {
    max_concurrent_runs = var.max_concurrent_runs
  }
  connections = [
    # Only one connection is created (its count is 0 or 1), so index 0.
    aws_glue_connection.conn_db2[0].name
  ]
  glue_version      = var.glue_version
  max_retries       = 0
  worker_type       = var.worker_type
  number_of_workers = 20
  timeout           = 2880
  tags              = local.common_tags
}
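
As an aside, if you are on Terraform 0.12.6 or newer, for_each avoids the parallel-list indexing entirely. A minimal sketch, reusing the zipmap pairing idea from the question (arguments that stay the same as the count version are omitted):

resource "aws_glue_job" "etl_jobs" {
  # Build the name => script map, but only when created_GL is true;
  # filtering the map here replaces the conditional count.
  for_each = {
    for name, script in zipmap(var.db2jobnames, var.script_location) :
    name => script if var.created_GL
  }

  name     = "${each.key}_db2etljobs"
  role_arn = aws_iam_role.glue_role.arn

  command {
    python_version  = var.python_version
    script_location = each.value
  }

  # ... remaining arguments identical to the count version above ...
}

With for_each, each instance is addressed by key (for example aws_glue_job.etl_jobs["sql_db_job"]), so adding or removing a job later does not shift the addresses of the remaining jobs the way count indexes do.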