I'm facing a weird behaviour on Azure. I'm provisioning 6 VMs in a single availability set via Terraform. As soon as the 6 VMs are ready (I see no errors in Terraform and the servers show up in the Azure portal), one of the VMs gets deleted automatically. Its state goes from Creating to Deleting in the activity log.
I'm using an availability set with default settings, where the number of update domains is 5. Does this setting force my new VMs (beyond 5) to be deleted?
After a few retries the 6th VM stayed in Azure. I did not change anything in the deployment process; I simply retried the Terraform run. Unfortunately I can't test with the update domain count set to 6, since I'm scaling up an already existing cluster. Changing the number of update domains forces VM recreation, which I don't want in production.
I'm using these providers:
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "=2.97.0"
    }
    azuread = {
      source  = "hashicorp/azuread"
      version = "=2.18.0"
    }
  }
}
Terraform version:
Tomass-MacBook-Pro:ansible thomas$ terraform --version
Terraform v1.1.7
on darwin_amd64
And I'm creating an availability set with 6 nodes in it like this:
resource "azurerm_availability_set" "k8s-master-as" {
  name                = "azure-${lower(replace(var.environment_name, "_", "-"))}-masters-availabily-set"
  resource_group_name = var.resource_group.name
  location            = var.resource_group.location
}
resource "azurerm_linux_virtual_machine" "k8s-master" {
  count = var.master_node_count
Any ideas why the 6th VM gets deleted most of the time when provisioned with Terraform?
Thanks
CodePudding user response:
I'm using an availability set with default settings, where the number of update domains is 5. Does this setting force my new VMs (beyond 5) to be deleted?
When more than five virtual machines are configured within a single availability set with five update domains, the sixth virtual machine is placed into the same update domain as the first virtual machine, the seventh into the same update domain as the second, and so on. VMs are assigned sequentially to update domains and fault domains. This might be the reason your 6th virtual machine is treated like the first one, which already exists, and gets deleted.
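The sequential placement described above can be sketched in plain Terraform expressions. This is only an illustration of the arithmetic, not something Azure or the azurerm provider exposes: the local names and the fixed `vm_count` of 6 are assumptions made for the example, and update domains are numbered from 0 here.

```hcl
# Illustration only: models the round-robin placement described above.
# Nothing here talks to Azure; all names are hypothetical.
locals {
  update_domain_count = 5 # default for an availability set
  vm_count            = 6 # stands in for var.master_node_count

  # Map each VM (1-based name) to a 0-based update domain index.
  update_domain_by_vm = {
    for i in range(local.vm_count) :
    "vm-${i + 1}" => i % local.update_domain_count
  }
}

output "update_domain_by_vm" {
  # "vm-1" and "vm-6" both map to update domain 0.
  value = local.update_domain_by_vm
}
```

Under this model the 6th VM simply shares an update domain with the 1st; sharing a domain means the two are rebooted together during planned maintenance.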
In the Azure Resource Manager (ARM) portal, an availability set has three fault domains and five update domains, but the number of update domains can be increased from 5 up to 20.
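For reference, here is a minimal sketch of an availability set with the domain counts written out explicitly instead of left at their defaults. The resource label `example` and the name string are placeholders, and the `var.resource_group` object is assumed to look like the one in the question. In the azurerm provider both count arguments are documented as forcing a new resource when changed, so treat this as a starting point for a new set rather than an edit to an existing one.

```hcl
# Hypothetical example: an availability set with explicit domain counts.
resource "azurerm_availability_set" "example" {
  name                = "example-availability-set" # placeholder name
  resource_group_name = var.resource_group.name
  location            = var.resource_group.location

  # Same values the provider uses when these arguments are omitted.
  platform_fault_domain_count  = 3
  platform_update_domain_count = 5

  # Required for VMs that use managed disks.
  managed = true
}
```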
Updating the update domains to 6 forces VM recreation, which I don't want in production.
Virtual machines are assigned update domains automatically when they are placed in an availability set, and all virtual machines within the same update domain reboot together. Changing the update domain count to 6 is not possible without recreating all virtual machines in the availability set, because a virtual machine can only be assigned to an availability set when it is created. So all virtual machines in your availability set would be recreated.