I am deploying my apps to AKS using Azure DevOps through Helm Charts.
Everything was working fine until my deployments have started failing all of a sudden.
In a deployment, 2 out of 3 or 1 out of 2 pods are able to pull the image from ACR & they are getting started. But the remaining pods give following error:
2m9s Warning Failed
pod/podname-5f6d498f6b-wzx5h Failed to pull image "acrname.azurecr.io/repo/image-name:1.0.73": rpc error: code = Unknown desc = failed to pull and unpack image "acrname.azurecr.io/repo/image-name:1.0.73": failed to resolve reference "acrname.azurecr.io/repo/image-name:1.0.73": failed to do request: Head "https://acrname.azurecr.io/v2/repo/image-name/manifests/1.0.73": dial tcp: lookup prd255acr01.azurecr.io on [::1]:53: read udp [::1]:34880->[::1]:53: read: connection refused
I can login to the ACR, AKS is able to pull the images from ACR & I can see the image pulled from ACR when I do docker images
How can I resolve this error? It is failing all of my deployments of all the apps.
CodePudding user response:
Issue is reflected here: https://bugs.launchpad.net/ubuntu/ source/systemd/ bug/1988119
rebooting pool nodes fixes issue.