The 2017 Docker Usage Report
Our high-level answer on Docker usage: it’s growing, in both scale and sophistication. Read on to find out more.
We analyzed usage of 45,000 Docker containers through monitoring data. Here's what we found.
The data you see here represents a snapshot of our customer behavior in early spring 2017. We have aggregated the data across all environments to provide unique insight into how our customers are using Docker containers right now. This is not meant to be a broad statement on the market as a whole - we can only speak to what we can report on from our own data.
We wanted to see if users were actually seeing this benefit. We asked the question, “Of customers running containers in their environments, what is the median number of containers being run?” At a median of ten containers per host, it seems as though Docker users are beginning to see some benefit in density. Note that deriving this median was a two-step process: we first calculated the average number of containers per host for each customer, and then took the median across customers. This felt like a more accurate way to understand where customers are with regard to density. We also saw a very broad range of deployment densities.
We saw some customers running as many as 95 containers per host, and others running a single container per host. The latter situation, while it may sound unusual, has underlying logic. Talking to those customers, we have learned that their core software uses all the resources on the machine. For them, the benefit of containerization is not container density. Rather, Docker adds value through the ability to develop, deploy, and scale software more quickly.
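The two-step median described above can be sketched in a few lines of Python. This is only an illustration of the calculation as the text describes it, not the code used for the report. The function name, the input shape (a mapping from customer to per-host container counts), and the sample numbers are all assumptions made up for the example.

```python
from statistics import mean, median


def median_containers_per_host(containers_by_customer):
    """Two-step density figure: per-customer average, then median across customers.

    containers_by_customer maps a customer id to a list holding the number of
    containers observed on each of that customer's hosts (hypothetical shape).
    """
    per_customer_average = [
        mean(host_counts)
        for host_counts in containers_by_customer.values()
        if host_counts  # skip customers with no reporting hosts
    ]
    return median(per_customer_average)


# Made-up sample data for illustration only, not figures from the report.
sample = {
    "customer_a": [1, 1, 1],    # one container per host
    "customer_b": [8, 12],      # moderate density
    "customer_c": [90, 100],    # very dense hosts
}

density = median_containers_per_host(sample)

# For comparison: a single median over all hosts, ignoring customer boundaries.
pooled = median(count for counts in sample.values() for count in counts)

print(density, pooled)
```

Averaging per customer first means a customer with many hosts counts once rather than once per host, so one large environment cannot dominate the result. In the sample above the two approaches give different answers, which is why the order of the steps matters.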
Monitoring Kubernetes has been a hot topic around the Sysdig office. However, with the newer version of Swarm, and the increasing efforts behind Open DC/OS, we have seen an increased variety of orchestrator platforms in use. These platforms have not caught up to Kubernetes within our customer base yet, but we expect to see continued competition here as these platforms mature. The results above also pull in enterprise-class distributions of the base orchestrator.
- Kubernetes includes OpenShift, Tectonic, and others
- Mesos includes Mesosphere DC/OS
- Swarm includes both the older version and native Swarm
Most popular container registries
A container registry is a place to store and distribute Docker images. You can think of a registry as a gas station - no matter where you go the gas is really the same, but one station might have decent coffee, another throws in a free car wash, and yet another is attached to the Safeway so you can get your groceries and gas at once.
Anyone who is seriously using Docker containers will likely be using a registry, either as a service or as on-premises software (aka a private registry). As you can see from the numbers above, even the top three combined don’t constitute a majority of the Docker containers in use within the sample. That’s indicative of how fragmented the registry offerings are. There are two footnotes to this data: (1) these percentages are based on container count, not on the registries in use per customer; (2) for a portion of our sample, the data did not allow us to accurately identify which container registry is in use. We assume that these are evenly distributed - nothing to our knowledge would indicate otherwise.
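To see why the first footnote matters, here is a small, hypothetical Python sketch that computes registry share both ways: by container count (the basis used above) and by customer. The function names, the `(customer, registry)` input shape, and the registry labels are assumptions for illustration. The data is invented and does not reflect the report's numbers.

```python
from collections import Counter


def share_by_container(containers):
    """Percentage of containers pulled from each registry."""
    counts = Counter(registry for _customer, registry in containers)
    total = sum(counts.values())
    return {registry: 100 * n / total for registry, n in counts.items()}


def share_by_customer(containers):
    """Percentage of customers that use each registry at least once."""
    customers_per_registry = {}
    for customer, registry in containers:
        customers_per_registry.setdefault(registry, set()).add(customer)
    all_customers = {customer for customer, _registry in containers}
    return {
        registry: 100 * len(customers) / len(all_customers)
        for registry, customers in customers_per_registry.items()
    }


# Invented data: one large customer on registry_x, two small ones on registry_y.
containers = (
    [("customer_a", "registry_x")] * 8
    + [("customer_b", "registry_y")]
    + [("customer_c", "registry_y")]
)

by_container = share_by_container(containers)
by_customer = share_by_customer(containers)

print(by_container)
print(by_customer)
```

With this made-up input, registry_x holds most of the containers even though most customers use registry_y. A container-weighted percentage therefore reflects where the containers come from, not how many customers chose a given registry.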
Our auto-discovery mechanism gives us a method to understand which open source components users are adopting inside their containers, without any explicit configuration or management by users. Above is a list of the twelve most common application checks we see running.
Nginx, etcd, and Consul top the list of components running inside Docker containers.
Of interest to us was that Nginx was the most popular. It is frequently serving at the endpoint of a service, or functioning as a traditional load balancer. Also somewhat surprising to us was that Consul was running so close to etcd - we expected more distance between the two. Note that this data does not focus on programming languages: most users are writing at least some custom application code in various languages, and we see those as “custom apps” if a customer chooses to write their own application check.
Popular alert conditions used with Docker containers
We wanted to better understand whether Docker adoption had changed the way people architect and operate their software. While alerts are not a perfect way to understand such a deep question, they do provide a glimpse into how people are thinking about their software in production. Most importantly, this data shows us that a shift away from host/physical infrastructure alerting is in progress, but older alerting conditions are still in use.
The image above shows you the most popular alert conditions, though they are not ranked. We felt that, given alert conditions really need to be paired with scope (see next section), it made the most sense to leave off the actual count of conditions used.
For example, tried-and-true alerts such as High CPU/Memory/Disk are still very much in use. At the same time, there are newer alert conditions that better relate to today’s abstracted infrastructure. A good example here is “Pod Restart Count”. We also see a number of alerts that have been adapted for the modern age: “Entity is Down” is now used across containers in addition to hosts, and “High CPU shares” is a CPU alert tuned for containers.
Finally, we see a number of alerts that haven’t changed, nor should they. Alerts on the response time of a service and alerts on HTTP error codes continue to be popular because they represent the state of the application, not just the state of container infrastructure. The biggest change with these alerts tends to be the scope across which they trigger, which we cover next.
Alerts are most commonly scoped with two to three tags. Many of the alert scopes operate across orchestrator constructs, such as a Kubernetes Deployment, Pod, or something similar. We also see, however, that many people are using the container name, which implies the user has a more personal understanding of what is running where inside their infrastructure.
Finally, we do see some important tagging related to physical infrastructure as well. Cloud provider tags typically represent some physical aspect of deployment, such as the host, region, or availability zone. The “Role of the Host” is typically applied at the Sysdig agent. Given that users only need to deploy a single Sysdig agent per host - regardless of the number of containers or pods - this implies they are using the tag in relation to physical infrastructure.
Conclusion: Docker usage continues to advance
Our experience in the Docker ecosystem is that users continue to advance in both the scale and sophistication of their Docker usage. This report gives the reader a good sense of where users are today, and provides a complement to survey-based reports that provide more insight on intention or objectives. We expect to repeat this report in about a year, and imagine that we will see significant changes in that time.