In this article, you’ll learn how to configure Keda to deploy a Kubernetes HPA that uses Prometheus metrics.
The Kubernetes Horizontal Pod Autoscaler can scale pods based on the usage of resources, such as CPU and memory. This is useful in many scenarios, but there are other use cases where more advanced metrics are needed – like the waiting connections in a web server or the latency in an API. Also, in other cases, you might need to combine multiple metrics in a formula or make aggregations.
Keda is an open source project that allows using Prometheus queries, along with multiple other scalers, to scale Kubernetes pods.
Kubernetes HPA
Kubernetes HPA can scale objects by relying on metrics present in one of the Kubernetes metrics API endpoints. You can read more about how Kubernetes HPA works in this article.
Kubernetes HPA is very helpful, but it has two important limitations. The first is that it doesn’t allow combining metrics. There are scenarios where combining multiple metrics is convenient, such as calculating the connection usage with the current number of established connections and the maximum number of connections.
The second limitation is the reduced number of metrics that Kubernetes exposes by default: just CPU and memory usage. Sometimes, applications expose more advanced metrics, either by themselves or through exporters. To expose more metrics, you need to publish them in the Kubernetes API metrics endpoint.
Connecting HPA and Prometheus metrics with KEDA
Keda is an open source project that simplifies using Prometheus metrics for Kubernetes HPA.
Installing Keda
The easiest way to install Keda is using Helm.
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
kubectl create namespace keda
helm install keda kedacore/keda --namespace keda
You can check out Keda’s documentation page for other installation methods.
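After the Helm release is installed, you can verify that Keda is up. A quick sketch (pod names will vary in your cluster):

```shell
# Keda's operator and metrics server pods should be running in the namespace created above
kubectl get pods --namespace keda

# Keda registers the external metrics API; this API service should be listed as available
kubectl get apiservice v1beta1.external.metrics.k8s.io
```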
How does Keda do it?
Keda has a Kubernetes operator that creates both the metrics server and the HPA by defining a Custom Resource Definition (CRD) object called ScaledObject. This object allows you to define what you want to scale and how you want to scale it.
What to scale
Easy: almost anything.
With Keda, you can scale the usual Kubernetes workloads, like Deployments or StatefulSets. You can also scale other CRDs – it even provides a dedicated CRD, ScaledJob, to scale jobs.
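As a sketch of the job case, a minimal ScaledJob could look like the following (the image, metric, and threshold here are hypothetical placeholders, not from this article's example):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: worker-scaledjob
spec:
  jobTargetRef:
    template:
      spec:
        containers:
          - name: worker
            image: my-worker:latest   # hypothetical image
        restartPolicy: Never
  maxReplicaCount: 10
  triggers:
    - type: prometheus
      metadata:
        serverAddress: https://prometheus_server/prometheus
        metricName: pending_tasks_keda
        query: sum(pending_tasks)     # hypothetical metric
        threshold: "10"
```

Instead of adding replicas to a long-running workload, Keda launches new Kubernetes Jobs to drain the pending work.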
How to scale
This is where the magic is done. You can define triggers in Keda, and there are a lot of different types of them. This article is focused on the Prometheus trigger.
When you set up a Prometheus trigger for a ScaledObject, you define a Prometheus endpoint and a Prometheus query. Keda uses that information to query your Prometheus server and create a metric in the Kubernetes external metrics API. Once you create the ScaledObject, Keda automatically creates the Kubernetes HPA for it.
That’s it. You don’t need to worry about publishing metrics in the Kubernetes API metrics endpoint or even creating the Kubernetes HPA object!
An example, please
Imagine that you want an HPA for the nginx-server deployment. You want it to scale from 1 to 5 replicas, based on the nginx_connections_waiting metric from the Nginx exporter. If there are more than 500 waiting connections, then you want to schedule a new pod.
Let’s create the query to trigger the HPA:
sum(nginx_connections_waiting{job="nginx"})
Easy, right? This query just returns the sum of the nginx_connections_waiting metric values for the nginx job.
Want to learn more about PromQL, the Prometheus query language? Check out the PromQL getting started guide – it also includes a cheatsheet!
Let’s define the ScaledObject for this example:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: nginx-scale
  namespace: keda-hpa
spec:
  scaleTargetRef:
    kind: Deployment
    name: nginx-server
  minReplicaCount: 1
  maxReplicaCount: 5
  cooldownPeriod: 30
  pollingInterval: 1
  triggers:
    - type: prometheus
      metadata:
        serverAddress: https://prometheus_server/prometheus
        metricName: nginx_connections_waiting_keda
        query: |
          sum(nginx_connections_waiting{job="nginx"})
        threshold: "500"
Notice the metricName parameter. This is a custom name you set for the metric that will receive the value of the query: Keda gets the result of the query and publishes it as the nginx_connections_waiting_keda metric. Then, it uses this metric to trigger the scaling. Also, remember to change the serverAddress to point to your own Prometheus server. :)
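To see how the threshold produces a replica count, here is a sketch of the averaging formula the generated HPA applies (assuming the trigger's default AverageValue metric target): the desired replica count is the metric total divided by the threshold, rounded up and clamped between minReplicaCount and maxReplicaCount. The input values below are made up for illustration:

```shell
# Sketch: desired = ceil(metric_total / threshold), clamped to [min, max]
total=1200          # e.g. a made-up result of sum(nginx_connections_waiting{job="nginx"})
threshold=500
min_replicas=1
max_replicas=5

desired=$(( (total + threshold - 1) / threshold ))   # integer ceiling
if [ "$desired" -lt "$min_replicas" ]; then desired=$min_replicas; fi
if [ "$desired" -gt "$max_replicas" ]; then desired=$max_replicas; fi
echo "$desired"     # 1200 / 500 rounded up -> 3 replicas
```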
Now, you simply need to apply the ScaledObject definition, and the HPA will start working.
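Assuming the manifest above is saved as nginx-scale.yaml (a hypothetical filename), that looks like:

```shell
kubectl apply -f nginx-scale.yaml

# Keda creates the HPA object for you; you can inspect it as usual
kubectl get hpa --namespace keda-hpa
```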
What else does Keda offer?
Along with all the benefits of using the metrics in your Prometheus server and applying Prometheus queries to combine them as you want, Keda has additional special features.
- It allows you to scale an object down to zero, while the default Kubernetes HPA only allows a minimum replica count equal to or greater than 1.
- It allows you to define a fallback number of replicas to use when it’s unable to get the metric value, e.g., after a connection error.
- It supports secure connections to Prometheus endpoints, including authentication.
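As a sketch of the last two points, a ScaledObject can declare a fallback replica count and reference a TriggerAuthentication object for a bearer token (the Secret name and key below are hypothetical):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: nginx-scale-secure
spec:
  scaleTargetRef:
    name: nginx-server
  fallback:                  # replicas to apply when the metric can't be read
    failureThreshold: 3
    replicas: 5
  triggers:
    - type: prometheus
      metadata:
        serverAddress: https://prometheus_server/prometheus
        query: sum(nginx_connections_waiting{job="nginx"})
        threshold: "500"
        authModes: bearer
      authenticationRef:
        name: prometheus-auth
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: prometheus-auth
spec:
  secretTargetRef:
    - parameter: bearerToken
      name: prometheus-secret   # hypothetical Kubernetes Secret
      key: token
```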
Putting it all together
In this article, you learned how to easily create a Kubernetes HPA, without needing to extend the Kubernetes API metrics endpoint yourself – just by installing and configuring Keda.
In the examples, you also learned how to use a Prometheus PromQL query to trigger the autoscaler.
Don’t want to set up a Prometheus server?
Register now for the free Sysdig Monitor trial and use the native Prometheus queries in Sysdig’s managed Prometheus service to trigger your HPA!