My GCE load balancer health checks are failing with a connection refused error, ultimately marking my GCE Ingress as UNHEALTHY.
Now I'm wondering how to fix this issue.
For my setup I'm using a GKE Autopilot cluster. I have torn down and recreated my setup several times, always with the same result.
Suppose I have a deployment configured with a pod template consisting of several containers (not all of which expose ports).
Side Note: For simplicity I skipped some configurations, such as config maps.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: some-app-deployment
spec:
  selector:
    matchLabels:
      app: some-app
  replicas: 2
  template:
    metadata:
      labels:
        app: some-app
    spec:
      restartPolicy: Always
      containers:
      - name: web-server
        image: {{some-app-image}}
        command: ['app', 'web-server']
        ports:
        - name: web-server
          containerPort: 5000
          protocol: TCP
      - name: admin-server
        image: {{some-app-image}}
        command: ['app', 'admin-server']
        ports:
        - name: admin-server
          containerPort: 5001
          protocol: TCP
      - name: worker
        image: {{some-app-image}}
        command: ['app', 'worker']
      - name: cron
        image: {{some-app-image}}
        command: ['app', 'cron']
      - name: helper
        image: {{some-app-image}}
        command: [ '/bin/bash', '-c', '--' ]
        args: [ 'while true; do sleep 30; done;' ]
The following is the BackendConfig CRD, which supposedly defines the health check.
I chose /favicon.ico as the path because the load balancer health check requires exactly a 200 OK response, and the web server's base path / responds with a 302 redirect, so it would fail the health check.
With kubectl port-forward I confirmed that /favicon.ico actually returns 200 OK.
To rule the path out as the problem, I also tried other paths, without success.
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: http-hc-config
spec:
  healthCheck:
    type: HTTP
    port: 5000
    requestPath: "/favicon.ico"
    checkIntervalSec: 20
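To compare paths the way the health check would see them, a probe that does not follow redirects is handy. The following is only a minimal Python sketch of that idea, not the load balancer's actual prober: the function name `probe` and the host, port, and path arguments are my own illustrative choices, and it assumes the target is reachable from where the script runs (for example through a port-forward).

```python
import http.client
from typing import Tuple


def probe(host: str, port: int, path: str, timeout: float = 5.0) -> Tuple[bool, str]:
    """Send one GET and report (healthy, detail); only an exact 200 counts as healthy."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        # http.client never follows redirects, so a 302 is reported as a 302.
        status = conn.getresponse().status
    except (OSError, http.client.HTTPException) as exc:
        # OSError covers refused connections and timeouts.
        return False, f"error: {exc.__class__.__name__}"
    finally:
        conn.close()
    return status == 200, f"HTTP {status}"


# Hypothetical usage against a local port-forward:
# print(probe("127.0.0.1", 5000, "/favicon.ico"))
```

A passing probe over a port-forward says nothing about whether the pod IP accepts connections from outside the pod, so it only rules out the path and status code.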
Additionally, a custom header is added for the admin endpoint.
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: x-header-config
spec:
  customRequestHeaders:
    headers:
    - "X-Client-Region:{client_region}"
The following is the service description. It references the BackendConfig via annotations as per the documentation. The documentation is not very specific as to how to reference the health check, so I mapped it to the relevant port.
apiVersion: v1
kind: Service
metadata:
  name: some-app-service
  annotations:
    cloud.google.com/backend-config: '{
      "default":"http-hc-config",
      "ports":{"4001":"x-header-config"}}'
spec:
  type: NodePort
  selector:
    app: some-app
  ports:
  - name: web
    targetPort: 5000
    protocol: TCP
    port: 4000
  - name: admin
    targetPort: 5001
    protocol: TCP
    port: 4001
I confirmed that my service is running, as I could successfully kubectl port-forward to the pods and access the web server's content.
Now the final piece of the setup is this ingress object.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress
  annotations:
    kubernetes.io/ingress.class: gce
spec:
  rules:
  - host: admin.example.com
    http:
      paths:
      - path: /*
        pathType: ImplementationSpecific
        backend:
          service:
            name: some-app-service
            port:
              number: 4001
  - http:
      paths:
      - path: /*
        pathType: ImplementationSpecific
        backend:
          service:
            name: some-app-service
            port:
              number: 4000
EDIT #1 & #2:
Running gcloud compute health-checks describe {{HEALTH_DEF_ID}} gives the following output for the health check on port 4001, which seems to be out of sync with the defined BackendConfig CRD:
checkIntervalSec: 15
creationTimestamp: '2022-02-07T11:30:50.059-08:00'
description: Default kubernetes L7 Loadbalancing health check for NEG.
healthyThreshold: 1
httpHealthCheck:
  portSpecification: USE_SERVING_PORT
  proxyHeader: NONE
  requestPath: /
id: {{REDACTED}}
kind: compute#healthCheck
logConfig:
  enable: true
name: k8s1-4622dadc-{{REDACTED}}
selfLink: https://www.googleapis.com/compute/v1/projects/{{REDACTED}}
timeoutSec: 15
type: HTTP
unhealthyThreshold: 2
And the following for port 4000, which surprisingly contains the right path and port configuration:
checkIntervalSec: 20
creationTimestamp: '2022-02-07T12:12:59.248-08:00'
description: Default kubernetes L7 Loadbalancing health check for NEG.
healthyThreshold: 1
httpHealthCheck:
  port: 5000
  portSpecification: USE_FIXED_PORT
  proxyHeader: NONE
  requestPath: /favicon.ico
id: {{REDACTED}}
kind: compute#healthCheck
name: k8s1-4622dadc-{{REDACTED}}
selfLink: https://www.googleapis.com/compute/v1/projects/{{REDACTED}}
timeoutSec: 15
type: HTTP
unhealthyThreshold: 2
Either there's something wrong with my setup, or the BackendConfig is not equally applied for all services in the rules section of the Ingress.
Edit #3:
The health check log entry for port 4000 shows:
healthCheckProbeResult: {
detailedHealthState: "TIMEOUT"
healthCheckProtocol: "HTTP"
healthState: "UNHEALTHY"
ipAddress: "10.40.129.203"
previousDetailedHealthState: "UNKNOWN"
previousHealthState: "UNHEALTHY"
probeCompletionTimestamp: "2022-02-07T20:06:10.955412154Z"
probeRequest: "/favicon.ico"
probeResultText: "HTTP response: , Error: Connection refused"
probeSourceIp: "35.191.12.114"
responseLatency: "0.000569s"
targetIp: "10.40.129.203"
targetPort: 5000
}
EDIT #4: I needed to adjust the BackendConfig annotation, as my case was in fact more involved compared to what my previous definitions were showing.
CodePudding user response:
The problem was in the way I applied the BackendConfig annotation to the service.
My assumption was that the "default" config would be applied to both ports and the extra header config would additionally be applied to port 4001.
But this is not the case.
If a port needs anything that deviates from the default, you have to define a complete BackendConfig for that port. BackendConfigs are not merged, and you cannot attach multiple configs to one port.
---
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: config-4000
spec:
  healthCheck:
    type: HTTP
    port: 5000
    requestPath: "/favicon.ico"
    checkIntervalSec: 20
---
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: config-4001
spec:
  healthCheck:
    type: HTTP
    port: 5000
    requestPath: "/favicon.ico"
    checkIntervalSec: 20
  customRequestHeaders:
    headers:
    - "X-Client-Region:{client_region}"
---
apiVersion: v1
kind: Service
metadata:
  name: some-app-service
  annotations:
    cloud.google.com/backend-config: '{"ports":{
      "4000":"config-4000",
      "4001":"config-4001"
    }}'
spec:
  type: NodePort
  selector:
    app: some-app
  ports:
  - name: web
    targetPort: 5000
    protocol: TCP
    port: 4000
  - name: admin
    targetPort: 5001
    protocol: TCP
    port: 4001
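To make the "no merging" behaviour concrete, here is a small Python sketch of how I now read the annotation. It is only a model of the behaviour described above, not the ingress controller's code; the function name `backend_config_for` is made up for illustration, and it assumes ports are keyed by their number as a string, as in my annotations.

```python
import json
from typing import Optional


def backend_config_for(annotation: str, service_port: str) -> Optional[str]:
    """Return the single BackendConfig name that applies to a Service port."""
    spec = json.loads(annotation)
    # A per-port entry replaces the default outright; the two are never combined.
    return spec.get("ports", {}).get(service_port, spec.get("default"))


# The annotation from the question and the corrected one from this answer.
original = '{"default":"http-hc-config","ports":{"4001":"x-header-config"}}'
fixed = '{"ports":{"4000":"config-4000","4001":"config-4001"}}'

for name, annotation in (("original", original), ("fixed", fixed)):
    for port in ("4000", "4001"):
        print(name, port, backend_config_for(annotation, port))
```

With the original annotation, port 4001 resolves to x-header-config only, which has no healthCheck section. That matches the default health check (requestPath /, USE_SERVING_PORT) seen for port 4001 in the question.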
EDIT #1:
An additional issue (the one causing the connection refused error) was in my project config:
- The server was configured to bind to 127.0.0.1 instead of 0.0.0.0
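The bind address matters because the health check probes the pod IP (10.40.129.203 in the log above), and a server listening only on 127.0.0.1 accepts loopback connections only, which is consistent with port-forward working while the probe was refused. My app is not shown here, so the following is just a generic Python sketch of the difference; `make_server` and `FaviconHandler` are hypothetical names, not part of my project.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class FaviconHandler(BaseHTTPRequestHandler):
    """Answer every GET with an empty 200, standing in for a health endpoint."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()


def make_server(host: str = "0.0.0.0", port: int = 5000) -> HTTPServer:
    # "0.0.0.0" listens on all interfaces, including the pod IP.
    # "127.0.0.1" would accept loopback connections only.
    return HTTPServer((host, port), FaviconHandler)


if __name__ == "__main__":
    make_server().serve_forever()
```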
CodePudding user response:
Can you run kubectl describe ing ingress?
I think you should see an error about an invalid wildcard. You should leave the * out.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress
  annotations:
    kubernetes.io/ingress.class: gce
spec:
  rules:
  - host: admin.example.com
    http:
      paths:
      - path: /
        pathType: ImplementationSpecific
        backend:
          service:
            name: some-app-service
            port:
              number: 4001
  - http:
      paths:
      - path: /
        pathType: ImplementationSpecific
        backend:
          service:
            name: some-app-service
            port:
              number: 4000