Prometheus query to get Pods in "Error|Completed" state

Time:10-27

I am trying to get a list of Pods that entered the "Error" or "Completed" state (in the ns1 and ns2 namespaces) in the last 5 minutes. I tried the following query, but with no luck:

kube_pod_status_phase{namespace=~".*ns1|.*ns2",phase!~"Succeeded|Pending|Running|Unknown"}[5m]

Am I doing something wrong here?

CodePudding user response:

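The port in the kube-state-metrics ServiceMonitor (port: http) did not match the Service port, so the metrics were not showing up in Prometheus. A ServiceMonitor endpoint refers to a Service port by its name, so the two have to agree.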

ServiceMonitor:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  annotations:
    meta.helm.sh/release-name: kube-prometheus-stack
    meta.helm.sh/release-namespace: insights
  creationTimestamp: "2022-09-07T05:08:57Z"
  generation: 6
  labels:
    app.kubernetes.io/component: metrics
    app.kubernetes.io/instance: kube-prometheus-stack
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/part-of: kube-state-metrics
    app.kubernetes.io/version: 2.4.1
    helm.sh/chart: kube-state-metrics-4.7.0
    release: kube-prometheus-stack
  name: kube-prometheus-stack-kube-state-metrics
  namespace: monitoring
  resourceVersion: "16414474"
  uid: ca3fb63d-c12e-4200-8804-a97a327a244d
spec:
  endpoints:
  - honorLabels: true
    metricRelabelings:
    - action: labelmap
      regex: __meta_kubernetes_pod_label_(.+)
      replacement: pod_label_$1
    - action: labelmap
      regex: __meta_kubernetes_pod_host_(.+)
      replacement: pod_host_$1
    port: http
    relabelings:
    - action: labelmap
      regex: __meta_kubernetes_pod_label_(.+)
      replacement: pod_label_$1
    - action: labelmap
      regex: __meta_kubernetes_pod_host_(.+)
      replacement: pod_host_$1
  jobLabel: app.kubernetes.io/name
  selector:
    matchLabels:
      app.kubernetes.io/instance: kube-prometheus-stack
      app.kubernetes.io/name: kube-state-metrics

Service:

apiVersion: v1
kind: Service
metadata:
  annotations:
    meta.helm.sh/release-name: kube-prometheus-stack
    meta.helm.sh/release-namespace: insights
    prometheus.io/scrape: "true"
  creationTimestamp: "2022-09-07T05:08:53Z"
  labels:
    app.kubernetes.io/component: metrics
    app.kubernetes.io/instance: kube-prometheus-stack
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/part-of: kube-state-metrics
    app.kubernetes.io/version: 2.4.1
    helm.sh/chart: kube-state-metrics-4.7.0
    release: kube-prometheus-stack
  name: kube-prometheus-stack-kube-state-metrics
  namespace: monitoring
  resourceVersion: "21563"
  uid: 68aace21-e164-485b-98b3-878d12c45701
spec:
  clusterIP: 172.20.83.175
  clusterIPs:
  - 172.20.83.175
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: http
    port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    app.kubernetes.io/instance: kube-prometheus-stack
    app.kubernetes.io/name: kube-state-metrics
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
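
To make the port check concrete, here is a minimal Python sketch that compares the port names used by a ServiceMonitor's endpoints with the port names declared on the Service. The dicts are trimmed-down copies of the two manifests above, and the helper name unmatched_endpoint_ports is hypothetical (it is not part of any Kubernetes or Prometheus tooling). With the manifests as posted, both sides use http, so nothing is reported.

```python
# Sketch only: the helper name and the trimmed dicts are illustrative assumptions,
# not part of kube-state-metrics or the Prometheus Operator.

def unmatched_endpoint_ports(service_monitor: dict, service: dict) -> list:
    """Return ServiceMonitor endpoint port names that the Service does not declare."""
    # Collect the names of all ports declared on the Service.
    service_port_names = {p.get("name") for p in service["spec"].get("ports", [])}
    # Keep every endpoint port name that has no Service port of the same name.
    return [
        ep["port"]
        for ep in service_monitor["spec"].get("endpoints", [])
        if "port" in ep and ep["port"] not in service_port_names
    ]


# Trimmed copies of the manifests shown above.
service_monitor = {"spec": {"endpoints": [{"port": "http", "honorLabels": True}]}}
service = {"spec": {"ports": [{"name": "http", "port": 8080, "targetPort": 8080}]}}

print(unmatched_endpoint_ports(service_monitor, service))  # prints []
```

If the ServiceMonitor endpoint used a name the Service does not declare (for example a hypothetical "metrics"), the function would return that name, which is the kind of mismatch described in this answer.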