How to troubleshoot Kubernetes OOM and CPU Throttle

Experiencing Kubernetes OOM kills can be very frustrating. Why is my application struggling if I have plenty of CPU in the node?

Managing Kubernetes pod resources can be a challenge. Many issues can arise, possibly due to an incorrect configuration of Kubernetes limits and requests.

In this article, we will try to help you detect the most common issues related to the usage of resources.

Kubernetes OOM problems

When any Unix-based system runs out of memory, the OOM safeguard kicks in and kills certain processes based on obscure rules only accessible to level 12 dark sysadmins (chaotic neutral). Kubernetes OOM management tries to act before the system has to trigger its own. When the node is low on memory, the Kubernetes eviction policy enters the game and stops pods, marking them as failed. These pods are rescheduled on a different node if they are managed by a ReplicaSet. This frees memory to relieve the memory pressure.

OOM kill due to container limit reached

This is by far the simplest memory error you can have in a pod. You set a memory limit, one container tries to allocate more memory than allowed, and it gets an error. This usually ends up with a container dying, one pod unhealthy and Kubernetes restarting that pod.

test          frontend        0/1     Terminating         0          9m21s

The output of kubectl describe pod would show something like this:

   State:          Running
      Started:      Thu, 10 Oct 2019 11:14:13 +0200
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Thu, 10 Oct 2019 11:04:03 +0200
      Finished:     Thu, 10 Oct 2019 11:14:11 +0200

  Type    Reason          Age                    From                                                  Message
  ----    ------          ----                   ----                                                  -------
  Normal  Scheduled       6m39s                  default-scheduler                                     Successfully assigned test/frontend to gke-lab-kube-gke-default-pool-02126501-7nqc
  Normal  SandboxChanged  2m57s                  kubelet, gke-lab-kube-gke-default-pool-02126501-7nqc  Pod sandbox changed, it will be killed and re-created.
  Normal  Killing         2m56s                  kubelet, gke-lab-kube-gke-default-pool-02126501-7nqc  Killing container with id docker://db:Need to kill Pod

Exit code 137 is important: together with the OOMKilled reason, it means that the system killed the container (with SIGKILL) because it tried to use more memory than its limit.
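Exit codes above 128 conventionally encode 128 plus the number of the signal that ended the process, so 137 is 128 + 9. A minimal Python sketch of that decoding follows; the helper name signal_from_exit_code is ours, for illustration only, and the signal names it returns depend on the platform it runs on.

```python
import signal
from typing import Optional


def signal_from_exit_code(exit_code: int) -> Optional[str]:
    """Return the signal name encoded in a container exit code, if any.

    Exit codes above 128 conventionally mean "killed by signal (code - 128)".
    """
    if exit_code <= 128:
        return None  # the process exited on its own
    try:
        return signal.Signals(exit_code - 128).name
    except ValueError:
        return None  # not a known signal number on this platform


print(signal_from_exit_code(137))  # SIGKILL on Linux
```

Note that a SIGKILL alone does not prove an OOM kill. It is the combination with Reason: OOMKilled in the describe output that points to the memory limit.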

In order to monitor this, you always have to look at the use of memory compared to the limit. Percentage of the node memory used by a pod is usually a bad indicator as it gives no indication on how close to the limit the memory usage is. In Kubernetes, limits are applied to containers, not pods, so monitor the memory usage of a container vs. the limit of that container.
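To make the "usage vs. limit" idea concrete, here is a small sketch that turns a memory limit such as 128Mi into bytes and reports how close each container is to it. The function names and the sample numbers are hypothetical, and the parser only handles a subset of the quantity suffixes Kubernetes accepts.

```python
# Binary and decimal suffixes used by Kubernetes memory quantities (a subset).
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_memory(quantity: str) -> int:
    """Convert a quantity such as '128Mi' or '1G' to bytes."""
    # Check two-letter suffixes first so 'Mi' is not mistaken for 'M'.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * _SUFFIXES[suffix])
    return int(quantity)  # plain number of bytes


def memory_limit_usage(used_bytes: int, limit: str) -> float:
    """Percentage of the container memory limit currently in use."""
    return 100.0 * used_bytes / parse_memory(limit)


# Hypothetical sample: container name -> (bytes in use, configured limit).
containers = {
    "frontend": (120 * 2**20, "128Mi"),
    "cache": (200 * 2**20, "1Gi"),
}
for name, (used, limit) in containers.items():
    print(f"{name}: {memory_limit_usage(used, limit):.1f}% of limit")
```

In this made-up sample, frontend uses less memory than cache in absolute terms, yet it is the one close to being OOM killed, which is why the ratio to the limit is the number to watch.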

Find these metrics in Sysdig Monitor in the dashboard: Hosts & containers → Container limits

Kubernetes OOM kill due to limit overcommit

Memory requested is granted to the containers so they can always use that memory, right? Well, it’s complicated. Kubernetes will not schedule pods on a node if their memory requests would sum to more than the memory available in that node. But limits can be higher than requests, so the sum of all limits can be higher than node capacity. This is called overcommit and it is very common. In practice, if all containers use more memory than requested, it can exhaust the memory in the node. This usually causes the death of some pods in order to free some memory.
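The arithmetic behind overcommit can be sketched in a few lines. The function name, the node size and the container values below are all hypothetical; the point is only that requests can fit in the node while limits do not.

```python
def overcommit_report(node_allocatable_mib, pods):
    """Summarize memory requests and limits against node allocatable memory.

    pods: list of (request_mib, limit_mib) tuples, one per container.
    """
    total_requests = sum(request for request, _ in pods)
    total_limits = sum(limit for _, limit in pods)
    return {
        "requests_fit": total_requests <= node_allocatable_mib,
        "limits_overcommitted": total_limits > node_allocatable_mib,
        "limit_ratio": total_limits / node_allocatable_mib,
    }


# Hypothetical node with 4096 MiB allocatable and three containers.
report = overcommit_report(4096, [(1024, 2048), (1024, 2048), (512, 1024)])
print(report)
```

Here the requests add up to 2560 MiB and fit in the node, but the limits add up to 5120 MiB, a ratio of 1.25 over what the node can actually provide.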

Memory limits in kubernetes can be bigger than the available memory in the node

Memory management in Kubernetes is complex, as it has many facets. Many parameters enter the equation at the same time:

  • Memory request of the container.
  • Memory limit of the container.
  • Lack of those settings.
  • Free memory in the system.
  • Memory used by the different containers.

With these parameters, a blender and some maths, Kubernetes computes a score. Last in the table is killed or evicted. The pod can be restarted depending on the policy, so that doesn’t mean the pod will be removed entirely.
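As a rough illustration of the blender, the sketch below classifies a pod into a quality of service class and derives the OOM score adjustment the kubelet assigns to its containers. The constants and the formula follow the Kubernetes documentation as we understand it and may differ between versions; the function names and the input layout are ours, not a Kubernetes API.

```python
def qos_class(containers):
    """Classify a pod from its containers' requests and limits.

    containers: list of dicts such as
        {"requests": {"cpu": 100, "memory": 256},
         "limits": {"cpu": 100, "memory": 256}}
    """
    if all(not c.get("requests") and not c.get("limits") for c in containers):
        return "BestEffort"
    # Guaranteed: every container sets CPU and memory limits, and requests
    # (which default to the limit when omitted) are equal to them.
    guaranteed = all(
        all(
            c.get("limits", {}).get(res) is not None
            and c.get("requests", {}).get(res, c["limits"][res]) == c["limits"][res]
            for res in ("cpu", "memory")
        )
        for c in containers
    )
    return "Guaranteed" if guaranteed else "Burstable"


def oom_score_adj(qos, memory_request_bytes, node_capacity_bytes):
    """Higher values are killed first (assumed values, see the note above)."""
    if qos == "Guaranteed":
        return -997
    if qos == "BestEffort":
        return 1000
    adj = 1000 - (1000 * memory_request_bytes) // node_capacity_bytes
    return min(max(2, adj), 999)


pod = [{"requests": {"memory": 2**30}, "limits": {"memory": 2 * 2**30}}]
qos = qos_class(pod)
print(qos, oom_score_adj(qos, 2**30, 4 * 2**30))
```

A burstable container that requests a larger fraction of the node memory gets a lower adjustment, so among burstable pods the ones that asked for little and use a lot are the likeliest victims.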

Despite this mechanism, we can still end up with system OOM kills, as Kubernetes memory management runs only every several seconds. If the system memory fills too quickly, the system can kill Kubernetes control processes, making the node unstable.

Edge case of Kubernetes OOM Kill when no container reaches the limit

This scenario should be avoided, as it will probably require complicated troubleshooting, ending with an RCA (root cause analysis) based on hypotheses and a node restart.

In day-to-day operation, this means that when resources are overcommitted, pods without limits will likely be killed, containers using more resources than requested have some chance of being killed, and guaranteed containers will most likely be fine.

CPU throttling due to CPU limit

There are many differences in how CPU and memory requests and limits are treated in Kubernetes. A container using more memory than the limit will most likely die, but using CPU can never be the reason for Kubernetes killing a container. CPU management is delegated to the system scheduler, and it uses two different mechanisms to enforce requests and limits.

CPU requests are managed using the shares system. This means that CPU resources are prioritized depending on the value of shares. Each CPU core is divided into 1,024 shares, and the processes with more shares have more CPU time reserved. Be careful: in moments of CPU starvation, shares won’t ensure your app has enough resources, as it can be affected by bottlenecks and general collapse.

Kubernetes pod requests are handled using 1024 shares per CPU

Tip: If a container requests 100m, the container will have 102 shares. These values are only used for pod allocation. Monitoring the shares in a pod does not give any idea of a problem related to CPU throttling.
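The conversion in the tip is plain integer arithmetic. A minimal sketch, assuming 1,024 shares per core as stated above and a small floor of 2 shares, which is our assumption about how the kubelet avoids handing out zero shares; the function name is hypothetical.

```python
SHARES_PER_CORE = 1024
MIN_SHARES = 2  # assumed floor so that tiny requests still get some weight


def cpu_shares(request_millicores: int) -> int:
    """Convert a CPU request in millicores (1000m = one core) to shares."""
    shares = request_millicores * SHARES_PER_CORE // 1000
    return max(MIN_SHARES, shares)


for request in (100, 250, 1000):
    print(f"{request}m -> {cpu_shares(request)} shares")
```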

Limits, on the other hand, are treated differently. They are managed with the CPU quota system. This works by dividing CPU time into 100ms periods and giving each container a quota of CPU time per period that is proportional to its limit.

Kubernetes pod limits can overcommit

Tip: If you set a limit of 100m, the process can use 10ms of each period of processing. The system will throttle the process if it tries to use more time than the quota, causing possible performance issues. A pod will never be terminated or evicted for trying to use more CPU than its quota; the system will just limit the CPU.
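The 10ms figure in the tip comes from scaling the 100ms period by the limit. A short sketch of that calculation, working in microseconds; the function and constant names are ours.

```python
CFS_PERIOD_US = 100_000  # the 100ms period described above, in microseconds


def cpu_quota_us(limit_millicores: int, period_us: int = CFS_PERIOD_US) -> int:
    """CPU time a container may use in each period, in microseconds."""
    return limit_millicores * period_us // 1000


for limit in (100, 500, 2000):
    print(f"limit {limit}m -> {cpu_quota_us(limit) / 1000:.0f}ms per period")
```

A limit above 1000m yields a quota longer than the period itself, which simply means the container can run on more than one core at a time.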

When a container tries to use more CPU than available it will throttle

If you want to know if your pod is suffering from CPU throttling, you have to look at the percentage of the assigned quota that is being used. Absolute CPU use can be treacherous, as you can see in the following graphs. CPU use of the pod is around 25%, but as that is the quota assigned, it is using 100% of its quota and consequently suffering CPU throttling.

Kubernetes CPU throttling
Find these metrics in Sysdig Monitor in the dashboard: Hosts & containers → Container limits
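If you want to reproduce these two views by hand, the sketch below computes CPU use as a percentage of the limit and the fraction of periods in which a container was throttled. The second function reads the nr_periods and nr_throttled counters from the text of a cgroup cpu.stat file; the sample text and all function names are hypothetical.

```python
def quota_usage_percent(used_millicores: float, limit_millicores: float) -> float:
    """CPU use expressed as a percentage of the container limit."""
    return 100.0 * used_millicores / limit_millicores


def throttled_ratio(cpu_stat_text: str) -> float:
    """Fraction of enforcement periods in which the cgroup was throttled."""
    stats = {}
    for line in cpu_stat_text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    periods = stats.get("nr_periods", 0)
    return stats.get("nr_throttled", 0) / periods if periods else 0.0


# A container using one core of a 4-core node (25% of the node) with a 1000m limit.
print(quota_usage_percent(1000, 1000))

# Made-up cpu.stat contents for illustration.
sample = "nr_periods 1200\nnr_throttled 300\nthrottled_time 4500000000\n"
print(throttled_ratio(sample))
```

Both counters are cumulative, so in practice you would compare two readings taken some time apart rather than the totals since the container started.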

There is a great difference between CPU and memory quota management. Regarding memory, a pod without requests and limits is considered best-effort and is the first on the list to be OOM killed. With the CPU, this is not the case. A pod without CPU limits is free to use all the CPU resources in the node. Well, truth is, the CPU is there to be used, but if you can’t control which process is using your resources, you can end up with a lot of problems due to CPU starvation of key processes.

Lessons learned

Knowing how to monitor resource usage in your workloads is of vital importance. This will allow you to discover different issues that can affect the health of the applications running in the cluster.

Understanding that your resource usage can compromise your application and affect other applications in the cluster is the crucial first step. You have to properly configure your requests and limits. Monitoring the resources and how they relate to the limits and requests will help you set reasonable values and avoid Kubernetes OOM kills. This will result in better performance for all the applications in the cluster, as well as a fair sharing of resources.

A good monitoring system like Sysdig Monitor will help you avoid pod eviction and pending pods.
