I'm using a launch template for my node group, and I'm getting this error: NodeCreationFailure: Instances failed to join the kubernetes cluster.
It seems my issue is that I need a bootstrap script to join the nodes to the cluster. Below are the user data scripts I've tried, starting from an example I found in this doc.
linux_user_data.tpl
Script attempt #1
#!/bin/bash
set -e
${pre_bootstrap_user_data ~}
export SERVICE_IPV4_CIDR=${cluster_service_ipv4_cidr}
B64_CLUSTER_CA=${cluster_auth_base64}
API_SERVER_URL=${cluster_endpoint}
/etc/eks/bootstrap.sh ${cluster_name} ${bootstrap_extra_args} --b64-cluster-ca $B64_CLUSTER_CA --apiserver-endpoint $API_SERVER_URL
${post_bootstrap_user_data ~}
Script attempt #2
#!/bin/bash
set -o xtrace
/etc/eks/bootstrap.sh mtc-cluster
Script attempt #3
#!/bin/bash
set -o xtrace
/etc/eks/bootstrap.sh --apiserver-endpoint ${aws_eks_cluster.eks.endpoint} --b64-cluster-ca ${aws_eks_cluster.eks.certificate_authority}
Here's my launch template in Terraform that's supposed to use this user data script:
resource "aws_launch_template" "node" {
  image_id      = var.image_id
  instance_type = var.instance_type
  key_name      = var.key_name
  name          = var.name
  user_data     = base64encode("linux_user_data.tpl")
  block_device_mappings {
    device_name = "/dev/sda1"
    ebs {
      volume_size = 20
    }
  }
}
Here's my node group resource block as well:
resource "aws_eks_node_group" "nodes_eks" {
  cluster_name    = aws_eks_cluster.eks.name
  node_group_name = "eks-node-group"
  node_role_arn   = aws_iam_role.eks_nodes.arn
  subnet_ids      = module.vpc.private_subnets
  scaling_config {
    desired_size = 3
    max_size     = 6
    min_size     = 3
  }
  ami_type             = "CUSTOM"
  capacity_type        = "ON_DEMAND"
  force_update_version = false
  launch_template {
    id      = aws_launch_template.node.id
    version = aws_launch_template.node.default_version
  }
  depends_on = [
    aws_iam_role_policy_attachment.amazon_eks_worker_node_policy,
    aws_iam_role_policy_attachment.amazon_eks_cni_policy,
    aws_iam_role_policy_attachment.amazon_ec2_container_registry_read_only,
  ]
}
CodePudding user response:
Based on the code posted in the question and the documentation, the second example should work. However, I think the way the file is referenced is what might be tripping up the nodes joining the cluster: base64encode("linux_user_data.tpl") encodes the file name as a literal string rather than the contents of the file. The scripts in the first and third attempts additionally require the templatefile function [1], because the values cannot be provided to the script otherwise, even with the interpolation defined properly, as in
/etc/eks/bootstrap.sh --apiserver-endpoint ${aws_eks_cluster.eks.endpoint} --b64-cluster-ca ${aws_eks_cluster.eks.certificate_authority}
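To make the difference concrete, here is a minimal sketch of three ways the file could be referenced. The local names (literal_name, raw_script, rendered) are hypothetical and only there for illustration; the file name and aws_eks_cluster.eks come from the question.

```hcl
locals {
  # Encodes the string "linux_user_data.tpl" itself. The file is never read,
  # so the instance receives the file name as its user data.
  literal_name = base64encode("linux_user_data.tpl")

  # Reads the file verbatim. Enough for a script with no placeholders,
  # such as attempt #2.
  raw_script = base64encode(file("${path.root}/linux_user_data.tpl"))

  # Reads the file and substitutes its ${...} placeholders with the values
  # in the map. Needed for a script that uses placeholders, such as attempt #1.
  rendered = base64encode(templatefile("${path.root}/linux_user_data.tpl", {
    cluster_name = aws_eks_cluster.eks.name
  }))
}
```

Keep in mind that templatefile fails if the template references a variable that is missing from the map, so every placeholder in the file needs a matching key.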
The documentation [2] describes a couple of ways to provide the configuration to the nodes. Since you are using Amazon Linux 2, you should follow the explanation in [3]. In this case, you can use the templatefile function with the following template (let's use the name you already have, linux_user_data.tpl):
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="==MYBOUNDARY=="

--==MYBOUNDARY==
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash
set -ex
/etc/eks/bootstrap.sh ${cluster_name} \
  --container-runtime containerd

--==MYBOUNDARY==--
Since Kubernetes 1.24 drops Docker as a container runtime, --container-runtime is set to containerd. With regard to functionality, there are no differences, but this way (if you are not yet deploying 1.24) it will ease the transition. In the aws_launch_template resource, you would then have to change the following:
resource "aws_launch_template" "node" {
  image_id      = var.image_id
  instance_type = var.instance_type
  key_name      = var.key_name
  name          = var.name
  user_data = base64encode(templatefile("${path.root}/linux_user_data.tpl", {
    cluster_name = aws_eks_cluster.eks.name
  }))
  block_device_mappings {
    device_name = "/dev/sda1"
    ebs {
      volume_size = 20
    }
  }
}
The documentation says the only required argument is the cluster name, so try this and update accordingly if needed. Also note the use of path.root, which means templatefile will look for the file in the directory of the root module. If you were to modularize the code, you could switch to path.module. More information about path can be found in [4].
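If it turns out you do need to pass the endpoint and the CA explicitly, as attempts #1 and #3 try to do, a sketch of the call could look like the following. It assumes the template from attempt #1, so every placeholder in that file gets a value. The local name node_user_data and the empty strings are illustrative, and the attribute paths are the ones I would expect from the aws_eks_cluster resource, so double-check them against your provider version.

```hcl
locals {
  # Hypothetical local; reference it as user_data in aws_launch_template.node.
  # Switch path.root to path.module if this lives in a child module.
  node_user_data = base64encode(templatefile("${path.root}/linux_user_data.tpl", {
    cluster_name     = aws_eks_cluster.eks.name
    cluster_endpoint = aws_eks_cluster.eks.endpoint

    # certificate_authority is a list of blocks, so index it and take .data
    # (assumed to already be the base64-encoded CA that bootstrap.sh expects).
    cluster_auth_base64 = aws_eks_cluster.eks.certificate_authority[0].data

    # Assumed attribute path for the service CIDR.
    cluster_service_ipv4_cidr = aws_eks_cluster.eks.kubernetes_network_config[0].service_ipv4_cidr

    # Placeholders from attempt #1 that are not needed here still need a value.
    bootstrap_extra_args     = ""
    pre_bootstrap_user_data  = ""
    post_bootstrap_user_data = ""
  }))
}
```

Passing plain variable names into the template, rather than writing ${aws_eks_cluster.eks.endpoint} inside the .tpl file as in attempt #3, matters because a template can only see the keys of the map given to templatefile.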
[1] https://www.terraform.io/language/functions/templatefile
[2] https://docs.aws.amazon.com/eks/latest/userguide/launch-templates.html#launch-template-user-data
[3] https://docs.aws.amazon.com/eks/latest/userguide/launch-templates.html#launch-template-custom-ami
[4] https://www.terraform.io/language/expressions/references#filesystem-and-workspace-info