I am new to Terraform scripting. I want to create multiple Glue jobs, each with a different name and a different script. Is there any way to create these multiple jobs from a single resource block with the help of variables?
For example, variable.tf:
variable "db2jobnames" {
  description = "db2 glue job names"
  type        = list(string)
  default     = ["sql_db_job", "sql_db_job2"]
}

variable "script_location" {
  description = "db2 glue job scripts"
  type        = list(string)
  default     = ["s3://s3_buget/sql_db_job.py", "s3://s3_buget/sql_db_job2.py"]
}
glue-connection.tf:
resource "aws_glue_connection" "conn_db2" {
  count           = var.created_CR ? 1 : 0
  connection_type = "JDBC"

  connection_properties = {
    JDBC_CONNECTION_URL = "jdbc:db2://lkidjhyft:50000/ZXHAG006G"
    PASSWORD            = "acfg3"
    USERNAME            = "ndhygsf"
  }

  name = "${var.department}-${var.application}-connection"

  physical_connection_requirements {
    availability_zone      = var.connection_availability_zone
    security_group_id_list = data.aws_security_groups.AWS_Public_Services.ids
    subnet_id              = data.aws_subnet.selected.id
  }
}
And my Glue job, main.tf:
resource "aws_glue_job" "etl_jobs" {
  count = var.created_GL ? 1 : 0
  count = "${length(var.db2jobnames)}"
  count = "${length(var.script_location)}"

  name     = "${var.db2jobnames[count.index]}_db2etljobs"
  role_arn = aws_iam_role.glue_role.arn

  command {
    python_version  = var.python_version
    script_location = "${var.script_location[count.index]}"
  }

  default_arguments = {
    "--extra-jars"                       = "${var.JarDir}"
    "--TempDir"                          = "${var.TempDir}"
    "--class"                            = "GlueApp"
    "--enable-continuous-cloudwatch-log" = "${var.enable-continuous-cloudwatch_log}"
    "--enable-glue-datacatalog"          = "${var.enable-glue-datacatalog}"
    "--enable-metrics"                   = "${var.enable-metrics}"
    "--enable-spark-ui"                  = "${var.enable-spark-ui}"
    "--job-bookmark-option"              = "${var.job-bookmark-option}"
    "--job-language"                     = "python"
    "--env"                              = "${var.paramregion}"
    "--spark-event-logs-path"            = "${var.sparkeventlogpath}"
  }

  execution_property {
    max_concurrent_runs = var.max_concurrent_runs
  }

  connections = [
    "${aws_glue_connection.conn_db2[count.index].name}"
  ]

  glue_version      = var.glue_version
  max_retries       = 0
  worker_type       = var.worker_type
  number_of_workers = 20
  timeout           = 2880
  tags              = local.common_tags
}
I have tried to insert two counts, but I am getting an error. How can we create the two jobs from this one resource, so that the first job is created with the first name and the first script location, as shown below?
job1 --> sql_db_job - s3://s3_buget/sql_db_job.py
job2 --> sql_db_job2 - s3://s3_buget/sql_db_job2.py
Any responses would be appreciated. Thank you.
CodePudding user response:
Based on the variables and code you have provided: a resource block can only have one count argument, so you would have to change count so that it uses the length of one of the lists. For example:
resource "aws_glue_job" "etl_jobs" {
  count = var.created_GL ? length(var.db2jobnames) : 0

  name     = "${var.db2jobnames[count.index]}_db2etljobs"
  role_arn = aws_iam_role.glue_role.arn

  command {
    python_version  = var.python_version
    script_location = var.script_location[count.index]
  }

  default_arguments = {
    "--extra-jars"                       = var.JarDir
    "--TempDir"                          = var.TempDir
    "--class"                            = "GlueApp"
    "--enable-continuous-cloudwatch-log" = var.enable-continuous-cloudwatch_log
    "--enable-glue-datacatalog"          = var.enable-glue-datacatalog
    "--enable-metrics"                   = var.enable-metrics
    "--enable-spark-ui"                  = var.enable-spark-ui
    "--job-bookmark-option"              = var.job-bookmark-option
    "--job-language"                     = "python"
    "--env"                              = var.paramregion
    "--spark-event-logs-path"            = var.sparkeventlogpath
  }

  execution_property {
    max_concurrent_runs = var.max_concurrent_runs
  }

  connections = [
    aws_glue_connection.conn_db2[0].name
  ]

  glue_version      = var.glue_version
  max_retries       = 0
  worker_type       = var.worker_type
  number_of_workers = 20
  timeout           = 2880
  tags              = local.common_tags
}
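Two things to keep in mind with this version. First, connections now references aws_glue_connection.conn_db2[0], which only exists when var.created_CR is true, so the created_GL and created_CR flags need to stay consistent. Second, count pairs the two lists purely by index, so they must always have the same length and ordering. As an alternative, here is a minimal, untested sketch that pairs each job name with its script using zipmap and for_each; everything not shown is assumed to be the same as in the resource above:

locals {
  # Pair each job name with its script; both lists must be the same length.
  db2_jobs = zipmap(var.db2jobnames, var.script_location)
}

resource "aws_glue_job" "etl_jobs" {
  # One job per map entry, or none when the flag is off.
  for_each = var.created_GL ? local.db2_jobs : {}

  name     = "${each.key}_db2etljobs"
  role_arn = aws_iam_role.glue_role.arn

  command {
    python_version  = var.python_version
    script_location = each.value
  }

  # ... remaining arguments as in the count-based version above ...
}

With for_each, each instance is addressed by name (for example aws_glue_job.etl_jobs["sql_db_job"]) rather than by position, so adding or removing an entry does not shift the indices of the other jobs the way it would with count.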