How to mitigate kubelet's CVE-2021-25741: Symlink exchange can allow host filesystem access

CVE-2021-25741 is a new vulnerability discovered in Kubernetes that allows users to create a container with subpath volume mounts to access files & directories outside of the volume, including the host filesystem.

It was disclosed in September 2021 and affects kubelet, which is the node agent that runs on each Kubernetes node. In particular CVE-2021-25741 affects kubelet in these Kubernetes versions:

v1.22.0 – v1.22.1
v1.21.0 – v1.21.4
v1.20.0 – v1.20.10
<= v1.19.14

As reported in the official issue opened in the Kubernetes repository, the severity level is high (CVSS 8.8) both because the attack complexity is low and because of its potential high impact. Users who are able to exploit it, gaining access to the host filesystem, can compromise the Kubernetes node and also all the other running containers, if any. This means that host configuration may be altered, and its data may be exfiltrated, modified, and deleted. So confidentiality, integrity and availability (CIA) of any data may be at risk!

In order to better understand the real impact of this vulnerability, we will explore how to mitigate and detect CVE-2021-25741.

Preliminary: Volumes on Kubernetes

Volumes in Kubernetes solve different problems related to the ephemeral nature of container storage.

When you run a container and store a new file directly inside its filesystem, then if the container crashes you will lose that file. Using a volume, instead, you can persist data beyond the container or pod’s lifetime.

You may also want files to be accessible by multiple containers within the same pod. Volumes allow you to achieve this in a consistent way.

So, volumes are directories that can be accessed by pod’s containers. These volumes can be of different types and are specified directly within pod specification at start time. One of these volume types is subPath. It is usually used when you want to share one volume for multiple purposes among different containers within a single pod. This is the scenario affected by the CVE-2021-25741 and the one that we will cover in this article.

How does subpath mounting work?

When a container is started, the kubelet volume manager locally mounts all the volumes required, and specified by the pod under a dedicated directory on the host system. The information about the volume mounts is then returned back to the container because it needs:

The path of the volume in the container.
The path of the volume on the host.

In this way, the container at start time can create the path in the container root filesystem, bind mounting it to the provided host path. This means that the container root filesystem will be replicated in the host one, and any modification will be immediately reflected.

The CVE-2021-25741 issue

The issue created by this vulnerability is related to the symlink resolution of the utils-linux mount. As a matter of fact, K8s doesn’t use the mount syscall directly, but uses the utils-linux one which resolves symlinks.

So, when using Volume Subpath, Kubernetes mounts the volume and then performs bind mounting of the subpath, giving it to the runtime. By the way, the symlink exchange should be prevented, since the mount operation is controlled by the container that was created, and so sometimes also by malicious users.

Here are reported the real issue behind this vulnerability and its resolution:

Screenshot of the kubelet code showing how MountSensitiveWithoutSystemd was being used instead of MountSensitiveWithoutSystemdWithMountFlags

The old function “MountSensitiveWithoutSystemd” (in red) is replaced by "MountSensitiveWithoutSystemdWithMountFlags" (in green) and a new parameter is added, which is the string "--no-canonicalize". This because the utils-linux mount canonicalizes by default all paths, as reported also in the documentation:

-c, --no-canonicalize
Don't canonicalize paths. The mount command canonicalizes 
	all paths (from the command line or fstab) by default...

In order to understand better what the malicious user can obtain by exploiting it, now we are going to explore the potential impact of this vulnerability.

The impact of CVE-2021-25741

According to the CVSS system, this is a high severity security issue with a score of 8.8.

To learn more about how a vulnerability score is calculated, Are Vulnerability Scores Tricking You? Understanding the severity of CVSS and using them effectively

No public exploit has been released for this vulnerability yet, but this high scoring is strictly related to the low attack complexity.

What is even more important is that the impact of the CVE-2021-25741 is very severe. Gaining access to the host filesystem means obtaining control of the host itself and also having access to the host data, which may contain data related to other containers. This may enable a malicious user to manipulate the host environment and to access, tamper, and delete data within its filesystem.

By the way, it seems that this vulnerability can only be exploited whenever the containers are run as root, which fortunately means that the user that wants to spawn them requires some privileges.

The environments that may be most seriously affected by it are those where the ability to create hostPath mounts is restricted.
This is because the CVE exploitation bypasses that restriction, allowing hostPath-like access without the use of the hostPath feature.

Remediation of CVE-2021-25741

CVE-2021-25741 was fixed in the following Kubernetes versions:

v1.22.2
v1.21.5
v1.20.11
v1.19.15

So, if your environment is affected by this vulnerability, you should update your Kubernetes version as soon as possible.

If immediate remediation is not possible, you can mitigate the risk by adopting other strategies. Keep reading to discover how to mitigate its impact and detect its exploitation.

Mitigating CVE-2021-25741

As suggested in the official Kubernetes repository, in order to mitigate this vulnerability you can disable the Volume Subpath feature on kubelet and kube-apiserver, removing those pods that are already using it.

But you can also take another solution and deploy OPA as an admission controller.

Mitigating at the admission controller level

The Kubernetes admission controller intercepts and processes requests to the K8s API, enforcing semantic objects validation, before the persistence of the objects that you want to create and after the request is authorized and authenticated. So, for example, the AC can block pods from running if the cluster is out of resources, if the images are not secure, or if the privileges granted to pods or deployments are not respected.

You can read more about Kubernetes admission controllers in:
Kubernetes admission controllers in 5 minutes →
Shielding your Kubernetes runtime with image scanning on admission controller →

Architecture of OPA Gatekeeper and how it communicates with the Kubernetes apiserver

You can adopt OPA Gatekeeper as an admission policy mechanism in order to create policies that enforce specific restrictions.

Since CVE-2021-25741 can be exploited when the user deploys a container with root permissions and with the Subpath field, we can try to leverage OPA in order to enforce the following policy:

OPA policy to block deploys of containers with root permissions and SubPath spec.

So, deploying a container with root permissions and with “SubPath” spec, this is what will be printed out:

Error from server: error when creating "test-pod.yaml": admission webhook "validating-webhook.openpolicyagent.org" denied the request: Container nginx may exploit CVE-2021-25741.

Pre and post-exploitation detection of CVE-2021-25741

Even though no public exploit for this CVE has been released, it’s important to monitor the vulnerable systems for associated suspicious activity. We can detect some behaviors that may be indicative of compromise by alerting on the deployment of root containers or in case of access to the host filesystem. In order to detect these suspicious events we will use Falco.

Falco is the CNCF open-source project, used to detect unexpected application behavior and to send alerts at runtime. It is based on a formal and simple language, with a predefined set of rules. By the way, these default rules can be customized or extended in order to obtain a well targeted detection within your environment.

Let’s take a look at some default Falco rules that help you to monitor some suspicious behaviors.

Falco rule: Container run as root user

As we saw previously, running containers as root may allow malicious users to take advantage of CVE-2021-25741. In order to detect the presence of these containers you may use the following rule:

- rule: Container Run as Root User
  desc: Detected container running as root user
  condition: spawned_process and container and proc.vpid=1 and user.uid=0 and not user_known_run_as_root_container
  enabled: false
  output: Container launched with root user privilege (uid=%user.uid container_id=%container.id container_name=%container.name image=%container.image.repository:%container.image.tag)
  priority: INFO
  tags: [container, process]

Falco rule: Read sensitive file untrusted

This rule detects read attempts performed over sensitive files, like user/password information.

- rule: Read sensitive file untrusted
  desc: >
    an attempt to read any sensitive file (e.g. files containing user/password/authentication information). Exceptions are made for known trusted programs.
  condition: >
    sensitive_files and open_read
    and proc_name_exists
    and not proc.name in (user_mgmt_binaries, userexec_binaries, package_mgmt_binaries,
     cron_binaries, read_sensitive_file_binaries, shell_binaries, hids_binaries,
     vpn_binaries, mail_config_binaries, nomachine_binaries, sshkit_script_binaries,
     in.proftpd, mandb, salt-minion, postgres_mgmt_binaries,
     google_oslogin_
     )
    and not cmp_cp_by_passwd
    and not ansible_running_python
    and not run_by_qualys
    and not run_by_chef
    and not run_by_google_accounts_daemon
    and not user_read_sensitive_file_conditions
    and not mandb_postinst
    and not perl_running_plesk
    and not perl_running_updmap
    and not veritas_driver_script
    and not perl_running_centrifydc
    and not runuser_reading_pam
    and not linux_bench_reading_etc_shadow
    and not user_known_read_sensitive_files_activities
    and not user_read_sensitive_file_containers
  output: >
    Sensitive file opened for reading by non-trusted program (user=%user.name user_loginuid=%user.loginuid program=%proc.name
    command=%proc.cmdline file=%fd.name parent=%proc.pname gparent=%proc.aname[2] ggparent=%proc.aname[3] gggparent=%proc.aname[4] container_id=%container.id image=%container.image.repository)
  priority: WARNING
  tags: [filesystem, mitre_credential_access, mitre_discovery]

Falco rule: Search private Keys or Passwords

Allows to detect if the attacker is trying to locate sensitive data or files.

- rule: Search Private Keys or Passwords
  desc: >
    Detect grep private keys or passwords activity.
  condition: >
    (spawned_process and
     ((grep_commands and private_key_or_password) or
      (proc.name = "find" and (proc.args contains "id_rsa" or proc.args contains "id_dsa")))
    )
  output: >
    Grep private keys or passwords activities found
    (user=%user.name command=%proc.cmdline container_id=%container.id container_name=%container.name
    image=%container.image.repository:%container.image.tag)
  priority:
    WARNING
  tags: [process, mitre_credential_access]

Falco rule: Read ssh information:

This rule may detect any read attempt below ssh directories.

- rule: Read ssh information
  desc: Any attempt to read files below ssh directories by non-ssh programs
  condition: >
    (consider_ssh_reads and
     (open_read or open_directory) and
     (user_ssh_directory or fd.name startswith /root/.ssh) and
     (not proc.name in (ssh_binaries)))
  output: >
    ssh-related file/directory read by non-ssh program (user=%user.name
    command=%proc.cmdline file=%fd.name parent=%proc.pname pcmdline=%proc.pcmdline)
  priority: ERROR
  tags: [filesystem, mitre_discovery]
Add a new Falco rule

Conclusion

CVE-2021-25741 allows users to access the host filesystem whenever they are able to create a container as root with subpath volume mounts. This can lead to disclosure of node’s sensitive data and, potentially, also to the compromise of the node itself and of the other containers.

Don’t hesitate to update your Kubernetes cluster to the newer version, if it is affected by this vulnerability, otherwise your cluster integrity and its data may be at risk!

If you want to mitigate the impact of CVE-2021-25741, you can use the admission controller. Instead, in order to detect suspicious post-exploitation behaviors, you can also adopt Falco with its default rules or with some custom ones.

If you would like to find out more about Falco:

Get started in Falco.org.
Check out the Falco project in GitHub.
Get involved with the Falco community.
Meet the maintainers on the Falco Slack.
Follow @falco_org on Twitter.

At Sysdig Secure, we extend Falco with out-of-the-box rules along with other open source projects, making them even easier to work with and manage Kubernetes security. Register for our Free 30-day trial and see for yourself!

How to mitigate kubelet’s CVE-2021-25741: Symlink exchange can allow host filesystem access