The following Prometheus exporters best practices will help you implement a Prometheus-based monitoring solution and increase your productivity.
Prometheus is one of the foundations of the cloud-native environment. It has become the de facto standard for visibility in Kubernetes environments, creating a new category called Prometheus monitoring. However, many of the applications and cloud services you're in charge of are third-party systems that do not natively expose Prometheus-formatted metrics. For example, Linux does not expose Prometheus-formatted metrics on its own. That's why Prometheus exporters, like the node exporter, exist: you can download and run one to expose hundreds of machine-level metrics.
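As a minimal sketch, a scrape configuration along these lines tells Prometheus where to find the node exporter; it assumes the exporter is already running on its default port, 9100, and the job name is just an illustrative placeholder.

```yaml
# prometheus.yml (fragment): scrape a node exporter on its default port.
# "node" is an illustrative job name; adjust the target to your own host.
scrape_configs:
  - job_name: "node"
    static_configs:
      - targets: ["localhost:9100"]
```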
Secure DevOps, also known as DevSecOps, brings security and monitoring throughout the application lifecycle, from development through production. This sets you up to deliver applications that are secure, stable, and high performance. This workflow plugs into your existing toolchain and provides a single source of truth across DevOps, developer, and security teams to maximize efficiency and provide visibility for troubleshooting and optimization.
In this article, we will take a look at some best practices when monitoring applications and cloud services using Prometheus exporters.
Recently, a customer told us…
“I spend one developer week per integration trying to figure out which version of Prometheus exporters, dashboards and alerts we should use, how to configure them, and how to keep up with open source changes.”
The maintenance burden created when running many Prometheus exporters is often worse than you might think. With so many exporters and configurations to consider, it can sometimes feel like the Wild West. By thinking ahead and arming yourself with these Prometheus exporters best practices, you can make your life easier so you can spend less time maintaining exporters and more time on the things that make a difference for your organization.
1. Find the right Prometheus exporter
As you start using Prometheus, you will soon discover that there are many Prometheus exporters available to monitor the various applications and cloud services that are critical to your environment. However, you will also be left with the problem of choosing which one to use. One way to handle this is to personally evaluate how well each candidate exporter provides the metrics you're interested in, and how healthy the exporter is as a software project. For example, an exporter that was developed years ago by one person, with no recent updates and a low number of PRs, issues, and GitHub stars, is likely unused and unmaintained. Many aspects of metrics and Prometheus exporters best practices are not obvious at first glance, so relying on existing curated lists can help you find exporters that are likely to keep working well into the future. Your first stop should be the Exporters and Integrations page on the main Prometheus website. If your application, or the protocol it exposes metrics over, is listed there, then that exporter is probably a good choice. There are also third-party listings of exporters, such as PromCat.io, maintained by Sysdig, or the default port allocations page, which inadvertently ended up as a fairly comprehensive list of exporters.
2. Understand your exporter metrics
Each exporter exposes its own set of metrics, and it is a Prometheus exporters best practice for the project to document them. Metric details are usually listed on the exporter's project page, although sometimes you'll have to look for them in a help file or a documentation page. If the exporter uses the OpenMetrics format, it can attach extra metadata to each metric, such as its type, unit, and help text.
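As an illustration, a single metric family exposed in OpenMetrics format might look roughly like the sample below (the value is made up); the HELP, TYPE, and UNIT lines carry that extra metadata.

```
# HELP node_cpu_seconds Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds counter
# UNIT node_cpu_seconds seconds
node_cpu_seconds_total{cpu="0",mode="idle"} 176243.52
# EOF
```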
3. Set alerts that matter and are actionable
Alerts can be challenging when learning a new application monitoring or cloud monitoring tool like Prometheus exporters. If your alert thresholds are too sensitive, your support teams will develop alert fatigue. On the other end of the spectrum, if your alerts don't trigger at the right time, you may miss important information about a condition that is affecting your end users. The first step in figuring out any alert strategy is to understand your applications, and the same applies to Prometheus exporters. By following the DevOps guidelines of Service Level Indicators and Service Level Objectives, along with the golden signals for your application or service, you can determine which critical conditions need an alert. A good monitoring tool with deep visibility and Kubernetes context can help find those critical factors. Working with tools that use native PromQL for alerts can save you time by avoiding the need to translate alerts into another format, which can lead to errors. For example, promtool can help you test PromQL alert configurations (among other things). However, your tool for alerts should be able to handle more than just PromQL. Beyond setting basic alerts on any metric or event, you should also be able to send the results of that alert to email, Slack, PagerDuty, ServiceNow, and so on.
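As a sketch, an actionable, SLO-style alerting rule might look like the example below; the metric name, threshold, and duration are assumptions for illustration, not recommendations for your environment.

```yaml
# rules.yml: a sketch of an actionable, SLO-style alert.
# The metric name, threshold, and duration are illustrative assumptions.
groups:
  - name: example-slo-alerts
    rules:
      - alert: HighErrorRate
        # Only fire when the error ratio stays above 5% for 10 minutes,
        # which avoids alert fatigue from short-lived spikes.
        expr: |
          sum(rate(http_requests_total{status=~"5.."}[5m]))
            / sum(rate(http_requests_total[5m])) > 0.05
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "HTTP error rate has been above 5% for 10 minutes"
```

Before loading a file like this into Prometheus, you can validate it with promtool check rules rules.yml.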
4. Enable your team to use your data (or not)
Now that you have this valuable data from your Prometheus exporters and you're using it for application monitoring and cloud service monitoring, make sure your team can see it and use it. The most common way to interact with those metrics is by visualizing the data in dashboards. To help create them, PromCat.io provides dashboard templates ready to be imported into Grafana or Sysdig Monitor. But how is your team managing those dashboards? It's a Prometheus exporters best practice that, instead of each team member creating their own dashboards, a single dashboard is created and shared across the whole DevOps team. Team members can use it as an example and make only minor changes if they need to. To make this work, your monitoring tool should be able to define who can see those dashboards and what level of access they are granted (View Only, or Collaborator with edit privileges). Sometimes, you need to be able to restrict access to your metric data as well. If your monitoring tool has full RBAC support, you can give each team just the data it needs. For example, an application team should only access the metrics emitted from its namespace, while an on-call team will need read-only access to all production hosts.
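For example, if your team standardizes on Grafana, a provisioning file along these lines lets everyone load the same version-controlled dashboards instead of maintaining personal copies; the provider name and folder path below are just placeholders.

```yaml
# A sketch of a Grafana dashboard provisioning file that loads a shared,
# version-controlled folder of dashboard JSON files for the whole team.
# The provider name and path are illustrative assumptions.
apiVersion: 1
providers:
  - name: "shared-prometheus-dashboards"
    type: file
    options:
      path: /var/lib/grafana/dashboards
```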

5. Have a plan for scale
As you continue to use Prometheus with more and more exporters, you might run into bottlenecks around visibility, horizontal scaling, and long-term storage. It's a Prometheus exporters best practice to plan how you will scale your Prometheus exporter metrics before you run into problems. Let's take a look at these Prometheus scaling challenges and some different ways to address them.
