GCE Load Balancer Health Check Fails (Connection Refused)

My (GCE) Load Balancer health checks are failing with a connection refused error, ultimately marking my GCE Ingress as UNHEALTHY. Now I'm wondering how to fix this issue.

For my setup I'm using a GKE Autopilot cluster. I have torn down and recreated my setup several times, always with the same result.

Suppose I have a deployment configured with a pod template consisting of several containers (not all of which expose ports).

Side Note: For simplicity I skipped some configurations, such as config maps.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: some-app-deployment
spec:
  selector:
    matchLabels:
      app: some-app
  replicas: 2
  template:
    metadata:
      labels:
        app: some-app
    spec:
      restartPolicy: Always
      containers:
        - name: web-server
          image: {{some-app-image}}
          command: ['app', 'web-server']
          ports:
            - name: web-server
              containerPort: 5000
              protocol: TCP
        - name: admin-server
          image: {{some-app-image}}
          command: ['app', 'admin-server']
          ports:
            - name: admin-server
              containerPort: 5001
              protocol: TCP
        - name: worker
          image: {{some-app-image}}
          command: ['app', 'worker']
        - name: cron
          image: {{some-app-image}}
          command: ['app', 'cron']
        - name: helper
          image: {{some-app-image}}
          command: [ '/bin/bash', '-c', '--' ]
          args: [ 'while true; do sleep 30; done;' ]

The following is the BackendConfig CRD, which is supposed to define the health check. I chose /favicon.ico as the path because the load balancer health check requires exactly a 200 OK response, and the web server's base path / returns a 302 redirect, which would fail the health check. Using kubectl port-forward, I confirmed that /favicon.ico does return 200 OK. To rule the path out as the cause, I also tried other paths, without success.

apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: http-hc-config
spec:
  healthCheck:
    type: HTTP
    port: 5000
    requestPath: "/favicon.ico"
    checkIntervalSec: 20

Additionally there is a custom header added to the admin endpoint.

apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: x-header-config
spec:
  customRequestHeaders:
    headers:
    - "X-Client-Region:{client_region}"

The following is the service description. It references the BackendConfig via annotations as per the documentation. The documentation is not very specific as to how to reference the health check, so I mapped it to the relevant port.

apiVersion: v1
kind: Service
metadata:
  name: some-app-service
  annotations:
    cloud.google.com/backend-config: '{
      "default":"http-hc-config",
      "ports":{"4001":"x-header-config"}}'
spec:
  type: NodePort
  selector:
    app: some-app
  ports:
    - name: web
      targetPort: 5000
      protocol: TCP
      port: 4000
    - name: admin
      targetPort: 5001
      protocol: TCP
      port: 4001

I confirmed that my service is running, as I could successfully kubectl port-forward to the pods and access the web server's content.

Now the final piece of the setup is this ingress object.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress
  annotations:
    kubernetes.io/ingress.class: gce
spec:
  rules:
    - host: admin.example.com
      http:
        paths:
        - path: /*
          pathType: ImplementationSpecific
          backend:
            service:
              name: some-app-service
              port:
                number: 4001
    - http:
        paths:
        - path: /*
          pathType: ImplementationSpecific
          backend:
            service:
              name: some-app-service
              port:
                number: 4000

EDIT #1 & #2: Running gcloud compute health-checks describe {{HEALTH_DEF_ID}} gives the following output for the health check on port 4001, which seems to be out of sync with the defined BackendConfig CRD:

checkIntervalSec: 15
creationTimestamp: '2022-02-07T11:30:50.059-08:00'
description: Default kubernetes L7 Loadbalancing health check for NEG.
healthyThreshold: 1
httpHealthCheck:
  portSpecification: USE_SERVING_PORT
  proxyHeader: NONE
  requestPath: /
id: {{REDACTED}}
kind: compute#healthCheck
logConfig:
  enable: true
name: k8s1-4622dadc-{{REDACTED}}
selfLink: https://www.googleapis.com/compute/v1/projects/{{REDACTED}}
timeoutSec: 15
type: HTTP
unhealthyThreshold: 2

And the following for port 4000, which surprisingly contains the right path and port configuration:

checkIntervalSec: 20
creationTimestamp: '2022-02-07T12:12:59.248-08:00'
description: Default kubernetes L7 Loadbalancing health check for NEG.
healthyThreshold: 1
httpHealthCheck:
  port: 5000
  portSpecification: USE_FIXED_PORT
  proxyHeader: NONE
  requestPath: /favicon.ico
id: {{REDACTED}}
kind: compute#healthCheck
name: k8s1-4622dadc-{{REDACTED}}
selfLink: https://www.googleapis.com/compute/v1/projects/{{REDACTED}}
timeoutSec: 15
type: HTTP
unhealthyThreshold: 2

Either there's something wrong with my setup, or the BackendConfig is not applied equally to all services in the rules section of the Ingress.

Edit #3:

The health check log entry for port 4000 shows:

healthCheckProbeResult: {
detailedHealthState: "TIMEOUT"
healthCheckProtocol: "HTTP"
healthState: "UNHEALTHY"
ipAddress: "10.40.129.203"
previousDetailedHealthState: "UNKNOWN"
previousHealthState: "UNHEALTHY"
probeCompletionTimestamp: "2022-02-07T20:06:10.955412154Z"
probeRequest: "/favicon.ico"
probeResultText: "HTTP response: , Error: Connection refused"
probeSourceIp: "35.191.12.114"
responseLatency: "0.000569s"
targetIp: "10.40.129.203"
targetPort: 5000
}

EDIT #4: I needed to adjust the BackendConfig annotation, as my case was in fact more involved compared to what my previous definitions were showing.

CodePudding user response：

The problem was in the way I applied the BackendConfig annotation of the service. My assumption was that the "default" config would be applied to both ports and the extra header config would additionally be applied to port 4001.

But this is not the case.

If a port needs a BackendConfig that deviates from the default, you have to define one complete BackendConfig for each such port. BackendConfigs are not merged, and you cannot attach multiple configs to one port.

---
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: config-4000
spec:
  healthCheck:
    type: HTTP
    port: 5000
    requestPath: "/favicon.ico"
    checkIntervalSec: 20
---
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: config-4001
spec:
  healthCheck:
    type: HTTP
    port: 5000
    requestPath: "/favicon.ico"
    checkIntervalSec: 20
  customRequestHeaders:
    headers:
    - "X-Client-Region:{client_region}"
---
apiVersion: v1
kind: Service
metadata:
  name: some-app-service
  annotations:
    cloud.google.com/backend-config: '{"ports":{
       "4000":"config-4000",
       "4001":"config-4001"
    }}'
spec:
  type: NodePort
  selector:
    app: some-app
  ports:
    - name: web
      targetPort: 5000
      protocol: TCP
      port: 4000
    - name: admin
      targetPort: 5001
      protocol: TCP
      port: 4001
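
To make the lookup behavior described in this answer concrete, here is a minimal Python sketch. It only models the semantics as I understand them (an entry under "ports" replaces "default" for that port, and nothing is merged). `resolve_backend_config` is a hypothetical helper written for illustration, not part of any GKE API.

```python
import json
from typing import Optional


def resolve_backend_config(annotation: str, service_port: str) -> Optional[str]:
    """Return the single BackendConfig name that applies to a Service port.

    Hypothetical helper modelling the behavior described in this answer:
    an entry under "ports" replaces "default" for that port; nothing is merged.
    """
    parsed = json.loads(annotation)
    ports = parsed.get("ports", {})
    if service_port in ports:
        return ports[service_port]
    return parsed.get("default")


# Annotation from the question: port 4001 only gets the header config,
# so it does not receive the health check defined in the default config.
original = '{"default": "http-hc-config", "ports": {"4001": "x-header-config"}}'
print(resolve_backend_config(original, "4000"))  # http-hc-config
print(resolve_backend_config(original, "4001"))  # x-header-config
```

Under this model, port 4001 in the original annotation ends up without the custom health check, which would match the default requestPath of / seen in the gcloud output for that port.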

EDIT #1:

An additional issue (the one actually causing the connection refused error) was in my project config:

The server was configured to bind to 127.0.0.1 instead of 0.0.0.0.
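
As a rough illustration of the bind address issue, here is a small Python sketch. It is not my actual application; `FaviconHandler`, `start_server` and `probe` are stand-ins I made up for this example. The stand-in server returns 200 for /favicon.ico and 302 for everything else, and `probe` treats only a 200 as healthy, similar to what the load balancer health check expects.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class FaviconHandler(BaseHTTPRequestHandler):
    """Stand-in handler: 200 for /favicon.ico, 302 for everything else."""

    def do_GET(self):
        if self.path == "/favicon.ico":
            self.send_response(200)
        else:
            self.send_response(302)
            self.send_header("Location", "/favicon.ico")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the example quiet


def start_server(host: str, port: int = 0) -> ThreadingHTTPServer:
    """Start the stand-in server in a background thread and return it."""
    server = ThreadingHTTPServer((host, port), FaviconHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def probe(host: str, port: int, path: str) -> str:
    """Mimic the health check: only an HTTP 200 counts as healthy."""
    try:
        conn = http.client.HTTPConnection(host, port, timeout=5)
        conn.request("GET", path)
        status = conn.getresponse().status
        conn.close()
    except OSError as exc:  # includes ConnectionRefusedError
        return f"UNHEALTHY ({exc.__class__.__name__})"
    return "HEALTHY" if status == 200 else f"UNHEALTHY (HTTP {status})"


# Binding to 0.0.0.0 listens on all interfaces, including the pod IP.
# Binding to 127.0.0.1 would only accept connections from inside the pod.
server = start_server("0.0.0.0")
port = server.server_address[1]
print(probe("127.0.0.1", port, "/favicon.ico"))  # HEALTHY
print(probe("127.0.0.1", port, "/"))             # UNHEALTHY (HTTP 302)
server.shutdown()
server.server_close()
```

This would also explain why kubectl port-forward worked for me while the health check got a connection refused: as far as I can tell, port-forward reaches the process from inside the pod's network namespace, so a loopback-only listener still answers, whereas the health check probes the pod IP.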

CodePudding user response：

Can you do a kubectl describe ing ingress?

I think you should see an error about an invalid wildcard. You should leave the * out.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress
  annotations:
    kubernetes.io/ingress.class: gce
spec:
  rules:
    - host: admin.example.com
      http:
        paths:
        - path: /
          pathType: ImplementationSpecific
          backend:
            service:
              name: some-app-service
              port:
                number: 4001
    - http:
        paths:
        - path: /
          pathType: ImplementationSpecific
          backend:
            service:
              name: some-app-service
              port:
                number: 4000