I have a Spring Boot application running on Kubernetes, and I am now trying to set up a Horizontal Pod Autoscaler (HPA).
I have one question: without changing any autoscaler thresholds, does the autoscaler consider pods only once they are ready (after the readiness probe succeeds), or also while their readiness is not yet complete?
Example:
- A Java app takes 5 minutes to start (i.e., to pass the readiness probe).
- During these 5 minutes, the app's CPU usage is at 100% of its assigned CPU request.
- The HPA is configured to scale when targetCPUUtilization reaches 50%.
- What happens when the HPA condition is satisfied but the pod is not ready yet? Will it add another pod right away, or will it first wait for the pods to become ready and only then start the timer for --horizontal-pod-autoscaler-initial-readiness-delay?
I assume the answer lies in the following passage, but it is not clear to me:
Due to technical constraints, the HorizontalPodAutoscaler controller cannot exactly determine the first time a pod becomes ready when determining whether to set aside certain CPU metrics. Instead, it considers a Pod "not yet ready" if it's unready and transitioned to unready within a short, configurable window of time since it started. This value is configured with the --horizontal-pod-autoscaler-initial-readiness-delay flag, and its default is 30 seconds. Once a pod has become ready, it considers any transition to ready to be the first if it occurred within a longer, configurable time since it started. This value is configured with the --horizontal-pod-autoscaler-cpu-initialization-period flag, and its default is 5 minutes
Also, can anyone explain --horizontal-pod-autoscaler-cpu-initialization-period and --horizontal-pod-autoscaler-initial-readiness-delay? The documentation is confusing.
CodePudding user response:
The Digital Ocean Predictive Horizontal Pod Autoscaler has the same kind of parameter: cpuInitializationPeriod.
It rephrases --horizontal-pod-autoscaler-cpu-initialization-period as:
the period after pod start when CPU samples might be skipped.
And --horizontal-pod-autoscaler-initial-readiness-delay as:
the period after pod start during which readiness changes will be treated as initial readiness.
The idea is to:
- not trigger any scaling based on CPU change alone (because the initial cpu-initialization-period means the pod is still becoming ready, with a potential CPU spike)
- not trigger any scaling based on readiness state changes (because the initial readiness-delay means that, even if the pod reports it is ready, that can change during that delay)
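To make the two windows more concrete, here is a minimal Python sketch of how I understand the controller decides whether to set aside a pod's CPU sample. It is a simplified model based on my reading of the upstream logic, not the actual controller code: the names PodState and cpu_sample_is_ignored are made up for illustration, and only the two default values (5 minutes and 30 seconds) come from the documentation quoted in the question.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Defaults of the two kube-controller-manager flags, as quoted in the docs.
CPU_INITIALIZATION_PERIOD = timedelta(minutes=5)
INITIAL_READINESS_DELAY = timedelta(seconds=30)


@dataclass
class PodState:
    """Hypothetical, simplified view of a pod as seen by the autoscaler."""
    start_time: datetime
    ready: bool                # current status of the Ready condition
    last_transition: datetime  # last time the Ready condition changed


def cpu_sample_is_ignored(pod: PodState, sample_time: datetime,
                          sample_window: timedelta, now: datetime) -> bool:
    """Return True if the pod's CPU sample should be set aside (assumed model)."""
    if now < pod.start_time + CPU_INITIALIZATION_PERIOD:
        # Still inside the CPU initialization period: skip the sample if the
        # pod is unready, or if a full metric window has not elapsed since
        # the last readiness change.
        return (not pod.ready) or sample_time < pod.last_transition + sample_window
    # Past the initialization period: skip only pods that are unready and
    # whose last readiness change happened within the initial readiness
    # delay, i.e. pods that apparently have never been ready.
    return (not pod.ready) and pod.last_transition < pod.start_time + INITIAL_READINESS_DELAY


if __name__ == "__main__":
    t0 = datetime(2024, 1, 1, 12, 0, 0)
    window = timedelta(seconds=30)
    # The slow Java pod from the question: unready since roughly its start.
    slow_pod = PodState(start_time=t0, ready=False,
                        last_transition=t0 + timedelta(seconds=2))
    for minutes in (2, 6):
        now = t0 + timedelta(minutes=minutes)
        print(minutes, cpu_sample_is_ignored(slow_pod, now, window, now))
```

Under this model, the slow-starting pod from the question has its CPU sample set aside both at 2 minutes (it is unready inside the initialization period) and at 6 minutes (it is unready and has never been ready), so its 100% CPU usage alone should not drive a scale-up. A pod that was ready and later turns unready after the readiness delay would be counted again.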
kubernetes/website issue 12657 has more (mainly to confirm that the original documentation is confusing).