GKE throws invalid certificate when fetching logs

Time:01-21

I'm trying to fetch the logs from a pod running in GKE, but I get this error:

I0117 11:42:54.468501   96671 round_trippers.go:466] curl -v -XGET  -H "Accept: application/json, */*" -H "User-Agent: kubectl/v1.26.0 (darwin/arm64) kubernetes/b46a3f8" 'https://x.x.x.x/api/v1/namespaces/pleiades/pods/pleiades-0/log?container=server'
I0117 11:42:54.569122   96671 round_trippers.go:553] GET https://x.x.x.x/api/v1/namespaces/pleiades/pods/pleiades-0/log?container=server 500 Internal Server Error in 100 milliseconds
I0117 11:42:54.569170   96671 round_trippers.go:570] HTTP Statistics: GetConnection 0 ms ServerProcessing 100 ms Duration 100 ms
I0117 11:42:54.569186   96671 round_trippers.go:577] Response Headers:
I0117 11:42:54.569202   96671 round_trippers.go:580]     Content-Type: application/json
I0117 11:42:54.569215   96671 round_trippers.go:580]     Content-Length: 226
I0117 11:42:54.569229   96671 round_trippers.go:580]     Date: Tue, 17 Jan 2023 19:42:54 GMT
I0117 11:42:54.569243   96671 round_trippers.go:580]     Audit-Id: a25a554f-c3f5-4f91-9711-3f2970376770
I0117 11:42:54.569332   96671 round_trippers.go:580]     Cache-Control: no-cache, private
I0117 11:42:54.571392   96671 request.go:1154] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Get \"https://10.6.128.40:10250/containerLogs/pleiades/pleiades-0/server\": x509: certificate is valid for 127.0.0.1, not 10.6.128.40","code":500}
I0117 11:42:54.572267   96671 helpers.go:246] server response object: [{
  "metadata": {},
  "status": "Failure",
  "message": "Get \"https://10.6.128.40:10250/containerLogs/pleiades/pleiades-0/server\": x509: certificate is valid for 127.0.0.1, not 10.6.128.40",
  "code": 500
}]

How do I prevent this from happening?

Answer:

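One possible cause of this error is that both metrics-server and the kubelet listen on port 10250. This is usually not a problem because metrics-server runs in its own network namespace, and if it were on the host network the port conflict would have prevented metrics-server from starting. With a hostPort: 10250 mapping, however, requests meant for the kubelet can reach metrics-server instead, which would explain a certificate that does not match the node IP.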

You can confirm this behavior by running the following command:

$ kubectl -n kube-system get pods -l k8s-app=metrics-server -o yaml | grep 10250
          - --secure-port=10250
          - containerPort: 10250
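
The grep above matches every line that mentions 10250, including the expected --secure-port and containerPort entries. A narrower check, sketched here on the assumption that the pod spec lists the mapping as a single `hostPort: 10250` line, filters for that key only. The function name `has_host_port` is hypothetical and not part of kubectl.

```shell
# Hypothetical helper: reads pod YAML on stdin and succeeds only if a
# "hostPort: <port>" line is present (optionally as the first key of a list item).
has_host_port() {
  grep -Eq "^[[:space:]]*(-[[:space:]]+)?hostPort:[[:space:]]*$1[[:space:]]*\$"
}

# Usage against the cluster (requires kubectl access):
# kubectl -n kube-system get pods -l k8s-app=metrics-server -o yaml \
#   | has_host_port 10250 && echo "hostPort 10250 is set"
```

The helper prints nothing itself; it only sets the exit status, so it can be combined with `&&` or used in an `if`.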

If the metrics-server YAML contains hostPort: 10250, run the following command to delete the metrics-server deployment on that cluster:

$ kubectl -n kube-system delete deployment -l k8s-app=metrics-server

The GKE infrastructure will then recreate metrics-server correctly. This should take about 15 seconds on clusters with a new addon manager, but could take up to 15 minutes on very old clusters.
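
To see when the replacement is up, one option is to poll for pods with the same label and then retry the log request. This is only a sketch, not a GKE-documented procedure: the function name `wait_for_metrics_server`, the default 5-second interval, and the 900-second default timeout (taken from the 15-minute upper bound above) are assumptions. A pod that is still terminating can also report Running, so treat a quick success as a hint rather than proof.

```shell
# Hypothetical helper: polls until some metrics-server pod reports phase Running,
# or gives up after the given number of seconds (default 900 = 15 minutes).
wait_for_metrics_server() {
  timeout="${1:-900}"
  interval="${2:-5}"
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    # Space-separated list of pod phases, e.g. "Pending Running"
    phases="$(kubectl -n kube-system get pods -l k8s-app=metrics-server \
      -o jsonpath='{.items[*].status.phase}' 2>/dev/null)"
    case " $phases " in
      *" Running "*) return 0 ;;
    esac
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 1
}

# Usage, with the namespace, pod and container names from the question:
# wait_for_metrics_server && kubectl -n pleiades logs pleiades-0 -c server
```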
