Trigger a Kubernetes HPA with Prometheus metrics

By David de Torres Huerta - OCTOBER 7, 2021
Topics: Monitoring


In this article, you’ll learn how to configure Keda to deploy a Kubernetes HPA that uses Prometheus metrics.

The Kubernetes Horizontal Pod Autoscaler can scale pods based on the usage of resources, such as CPU and memory. This is useful in many scenarios, but there are other use cases where more advanced metrics are needed – like the waiting connections in a web server or the latency in an API. Also, in other cases, you might need to combine multiple metrics in a formula or make aggregations.

Keda is an open source project that allows using Prometheus queries, along with multiple other scalers, to scale Kubernetes pods.

Kubernetes HPA

Kubernetes HPA can scale objects by relying on metrics present in one of the Kubernetes metrics API endpoints. You can read more about how Kubernetes HPA works in this article.

Kubernetes HPA is very helpful, but it has two important limitations. The first is that it doesn’t allow combining metrics. There are scenarios where combining multiple metrics is convenient, such as calculating the connection usage with the current number of established connections and the maximum number of connections.

The second limitation is the small set of metrics that Kubernetes exposes by default: just CPU and memory usage. Applications often expose more advanced metrics, either by themselves or through exporters, but to use them in an HPA you need to publish them in the Kubernetes API metrics endpoint.

Connecting HPA and Prometheus metrics with Keda

Keda is an open source project that simplifies using Prometheus metrics for Kubernetes HPA.

Installing Keda

The easiest way to install Keda is using Helm.

helm repo add kedacore https://kedacore.github.io/charts
helm repo update
kubectl create namespace keda
helm install keda kedacore/keda --namespace keda

You can check out Keda’s documentation page for other installation methods.

How does Keda do it?

Keda runs a Kubernetes operator that provides both the metrics server and the HPA. You drive it through a custom resource called ScaledObject, which is defined by a Custom Resource Definition (CRD). In this object, you declare what you want to scale and how you want to scale it.

What to scale

Easy: almost anything.

With Keda, you can scale the usual Kubernetes workloads, like Deployments or StatefulSets. You can also scale other custom resources, and there is even a separate CRD, ScaledJob, to scale jobs.

How to scale

This is where the magic happens. You define triggers in Keda, and there are many different types of them. This article focuses on the Prometheus trigger.

When you set up a Prometheus trigger for a ScaledObject, you define a Prometheus endpoint and a Prometheus query. Keda uses that information to query your Prometheus server and expose the result as a metric in the Kubernetes external metrics API. Once you create the ScaledObject, Keda automatically creates the Kubernetes HPA for it.

That’s it. You don’t need to worry about publishing metrics in the Kubernetes API metrics endpoint or even creating the Kubernetes HPA object!

An example, please

Imagine that you want an HPA for the nginx-server deployment. You want it to scale from 1 to 5 replicas, based on the nginx_connections_waiting metric from the Nginx exporter. If there are more than 500 waiting connections, then you want to schedule a new pod.

Let’s create the query to trigger the HPA:

sum(nginx_connections_waiting{job="nginx"})

Easy, right? This query just returns the sum of the nginx_connections_waiting metric value for the nginx job.
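
Before wiring the query into Keda, it can help to run it directly against the Prometheus HTTP API and check that it returns a single value. The snippet below is a minimal sketch: the PROM_URL address is a placeholder, the query_prometheus helper is a hypothetical name used only for illustration, and the job="nginx" label is an assumption about how your exporter is scraped.

```shell
# Placeholder address; replace it with your own Prometheus server.
PROM_URL="https://prometheus_server/prometheus"

# Hypothetical helper: run an instant query against the Prometheus HTTP API
# and print the raw JSON response.
query_prometheus() {
  curl --silent --get "${PROM_URL}/api/v1/query" \
    --data-urlencode "query=$1"
}

# Usage (needs a reachable Prometheus server):
# query_prometheus 'sum(nginx_connections_waiting{job="nginx"})'
```

If the query is valid, the response should have "status": "success" and a result vector with one sample, which is the value Keda would compare against the threshold.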

Want to learn more about PromQL, the Prometheus query language? Check out the PromQL getting started guide – it also includes a cheatsheet!

Let’s define the ScaledObject for this example:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: nginx-scale
  namespace: keda-hpa
spec:
  scaleTargetRef:
    kind: Deployment
    name: nginx-server
  minReplicaCount: 1
  maxReplicaCount: 5
  cooldownPeriod: 30
  pollingInterval: 1
  triggers:
  - type: prometheus
    metadata:
      serverAddress: https://prometheus_server/prometheus
      metricName: nginx_connections_waiting_keda
      query: |
        sum(nginx_connections_waiting{job="nginx"})
      threshold: "500"

Notice the metricName parameter. This is a custom name you set for receiving the value from the query. Keda gets the result of the query and creates the nginx_connections_waiting_keda metric with it. Then, it uses this metric to trigger the scaling. Also, remember to change the serverAddress. :)
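
To get a feel for how the threshold of 500 and the 1 to 5 replica range interact, here is a rough sketch of the arithmetic. It assumes that the threshold is treated as a target average value per pod, so the requested replica count is roughly the query result divided by the threshold, rounded up. The desired_replicas function is a hypothetical helper, and it ignores the HPA tolerance and stabilization behavior, so treat it as an estimate rather than the exact algorithm.

```shell
# Hypothetical helper: estimate the replica count the HPA would request.
# Usage: desired_replicas <query_result> <threshold> <min_replicas> <max_replicas>
# Assumes integer inputs and an average-value target.
desired_replicas() {
  total=$1
  threshold=$2
  min=$3
  max=$4
  # Integer ceiling of total / threshold
  wanted=$(( (total + threshold - 1) / threshold ))
  # Clamp to the [min, max] range from the ScaledObject
  if [ "$wanted" -lt "$min" ]; then wanted=$min; fi
  if [ "$wanted" -gt "$max" ]; then wanted=$max; fi
  echo "$wanted"
}

desired_replicas 1200 500 1 5   # prints 3
desired_replicas 200 500 1 5    # prints 1
desired_replicas 4000 500 1 5   # prints 5
```

With 1200 waiting connections, the estimate is three replicas. With 4000, it would be eight, but maxReplicaCount caps it at five.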

Now, you simply need to apply the ScaledObject definition, and the HPA will start working.
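
As a sketch of that last step, the commands below apply the manifest and then inspect what Keda created. The file name nginx-scaledobject.yaml is an assumption, so use whatever name you saved the definition under. The commands need access to a cluster where Keda is installed.

```shell
# File name is an assumption; point this at your saved ScaledObject.
kubectl apply -f nginx-scaledobject.yaml

# The ScaledObject and the HPA that Keda generated for it
kubectl get scaledobject nginx-scale --namespace keda-hpa
kubectl get hpa --namespace keda-hpa

# Metrics registered in the external metrics API
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1"
```

The generated HPA is typically named after the ScaledObject with a keda-hpa- prefix, so here you would expect to see keda-hpa-nginx-scale in the list.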

What else does Keda offer?

Along with all the benefits of using the metrics in your Prometheus server and applying Prometheus queries to combine them as you want, Keda has additional special features.

  • It allows you to scale an object down to zero, while the default Kubernetes HPA only allows a minimum value equal to or greater than 1.
  • It allows you to define the number of replicas to use when it’s unable to get the value of the metric, for example, on a connection error.
  • It supports a secure connection with Prometheus endpoints with authentication.
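
The first two features can be sketched as a variant of the earlier ScaledObject. This is a hedged example: it assumes a Keda 2.x release that supports the fallback section, the file name is hypothetical, and the fallback values are only illustrative. Authentication is configured through a separate TriggerAuthentication resource, which is not shown here.

```shell
# Write a variant of the ScaledObject to a file (the file name is an assumption).
cat > nginx-scale-extras.yaml <<'EOF'
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: nginx-scale
  namespace: keda-hpa
spec:
  scaleTargetRef:
    kind: Deployment
    name: nginx-server
  minReplicaCount: 0        # allow scaling down to zero replicas
  maxReplicaCount: 5
  fallback:
    failureThreshold: 3     # consecutive failed queries before falling back
    replicas: 2             # replicas to hold while the metric is unavailable
  triggers:
  - type: prometheus
    metadata:
      serverAddress: https://prometheus_server/prometheus
      metricName: nginx_connections_waiting_keda
      query: |
        sum(nginx_connections_waiting{job="nginx"})
      threshold: "500"
EOF
```

You could then apply it with kubectl apply -f nginx-scale-extras.yaml on a cluster where Keda is installed.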

Putting it all together

In this article, you learned how to create a Kubernetes HPA easily, just by installing and configuring Keda, without the need to extend the Kubernetes API metrics endpoint.

In the examples, you also learned how to use a Prometheus PromQL query to trigger the autoscaler.
