HPA creates more pods than expected


I created an HPA on our k8s cluster that should auto-scale at 90% memory utilization. However, it scales up without hitting the target percentage. I use the following config:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  namespace: {{ .Values.namespace }}
  name: {{ include "helm-generic.fullname" . }}
  labels:
    {{- include "helm-generic.labels" . | nindent 4 }}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ include "helm-generic.fullname" . }}
  minReplicas: 1
  maxReplicas: 2
  metrics:
    - type: Resource
      resource:
        name: memory
        targetAverageUtilization: 90

With this config it creates 2 pods, which is the maxReplicas number. If I set maxReplicas to 4, it creates 3.

This is what I get from kubectl describe hpa:

$ kubectl describe hpa -n trunkline

Name:                                                     test-v1
Namespace:                                                trunkline
Labels:                                                   app.kubernetes.io/instance=test-v1
                                                          app.kubernetes.io/managed-by=Helm
                                                          app.kubernetes.io/name=helm-generic
                                                          app.kubernetes.io/version=0.0.0
                                                          helm.sh/chart=helm-generic-0.1.3
Annotations:                                              meta.helm.sh/release-name: test-v1
                                                          meta.helm.sh/release-namespace: trunkline
CreationTimestamp:                                        Wed, 12 Oct 2022 17:36:54 +0300
Reference:                                                Deployment/test-v1
Metrics:                                                  ( current / target )
  **resource memory on pods  (as a percentage of request):  59% (402806784) / 90%**
  resource cpu on pods  (as a percentage of request):     11% (60m) / 80%
Min replicas:                                             1
Max replicas:                                             2
Deployment pods:                                          **2 current / 2** desired
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    ReadyForNewScale    recommended size matches current size
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from memory resource utilization (percentage of request)
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:           <none>

As you can see, the pods' memory utilization is 59% against a target of 90%, which I would expect to produce only 1 pod.

CodePudding user response:

This is working as intended.

targetAverageUtilization is a target on the average utilization across all matching Pods.

The idea of HPA is:

  • Scale up? We have 2 Pods and the average memory utilization is only 59%, which is under 90%, so there is no need to scale up.
  • Scale down? 59% is the average across 2 Pods under the current load, so if a single Pod had to take all of that load its utilization would rise to roughly 59% * 2 = 118%, which is over 90% and would force the HPA to scale up again immediately; therefore it does not scale down (see the projection below).
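
Put in general terms (assuming the total load on the Deployment stays roughly constant), the utilization you would expect with r replicas is the current average scaled by currentReplicas / r, so dropping from 2 replicas to 1 would overshoot the 90% target:

projectedUtilization(r) ≈ currentUtilization * currentReplicas / r
projectedUtilization(1) ≈ 59% * 2 / 1 = 118%, which is above the 90% target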

CodePudding user response:

The horizontal pod autoscaler has a very specific formula for calculating the target replica count:

desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]

With the output you show, currentMetricValue is 59% and desiredMetricValue is 90%. That ratio (about 0.66) multiplied by the currentReplicas of 2 gives about 1.31 replicas, which ceil() rounds up to 2.

This formula, and especially the ceil() round-up behavior, can make HPA very slow to scale down, especially with a small number of replicas.
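
Plugging the numbers from the describe output into that formula confirms the behavior you see, and also shows how far average memory utilization has to fall before the HPA will recommend a single replica:

desiredReplicas = ceil[2 * ( 59% / 90% )] = ceil[1.31] = 2
ceil[2 * ( u / 90% )] = 1  only once  u <= 45%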

More broadly, autoscaling on Kubernetes-observable memory might not work the way you expect. Most programming languages are garbage-collected (C, C++, and Rust are the most notable exceptions), and garbage collectors as a rule allocate a large block of operating-system memory and reuse it rather than returning it to the operating system when load decreases. If you have a pod that reaches 90% memory from the Kubernetes point of view, it's possible that its memory usage will never decrease. You might need to autoscale on a different metric, or attach an external metrics system like Prometheus to get more detailed memory-manager statistics you can act on.
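
Purely as a sketch of that last suggestion, here is the same HPA scaling on CPU instead of memory, written against the same autoscaling/v2beta1 API as the chart above. The name, namespace, and 80% CPU target are taken from the describe output; substitute your own chart values:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: test-v1                      # HPA name from the describe output above
  namespace: trunkline
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-v1                    # the Deployment referenced in the describe output
  minReplicas: 1
  maxReplicas: 2
  metrics:
    - type: Resource
      resource:
        name: cpu
        targetAverageUtilization: 80 # matches the 80% CPU target already reported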
