Visibility and Security for GKE Autopilot

By Eric Carter - DECEMBER 14, 2021


GKE Autopilot from Google Cloud is a mode of operation in Google Kubernetes Engine (GKE) designed to simplify working with Kubernetes in the cloud. Pairing secure DevOps practices with GKE Autopilot will help you and your teams ensure the security, compliance, and performance of your workloads and applications.

Sysdig has collaborated with Google Cloud to enable visibility and security for GKE Autopilot and your containers. In this post, we’ll show you how to get started with the solution and what you can do to follow security best practices.

What is GKE Autopilot?

With GKE Autopilot, Google Cloud provisions and manages all of the infrastructure underlying your Kubernetes clusters, including nodes and node pools. For you, this means minimal cluster setup and no need to work out the size and shape of the nodes required to support your workloads. You also benefit from the experience and practices the Google SRE team has built up supporting Google Cloud customers on Kubernetes.

To get you started on the right foot, GKE Autopilot follows best practices for cluster hardening, workload setup, and security. For example, GKE Autopilot blocks capabilities considered unsafe or prone to configuration error. SSH access to nodes is disabled, so there is no way to log in to the underlying servers. This helps prevent accidental or intentional modifications that can cause issues.

The goal of GKE Autopilot is to give you a hands-off, fully managed solution so you can focus on your workloads rather than on Kubernetes management. To get started, all you need to do is name the cluster, pick a region, and set the network. In minutes you have a secure platform to support your Kubernetes workloads.

Getting started with GKE Autopilot on Google Cloud

GKE Autopilot visibility and security with Sysdig

Using the Sysdig Secure DevOps Platform, you will be able to follow container security best practices on your GKE Autopilot clusters. The platform provides a suite of capabilities across the DevOps lifecycle, from securing containers before they’re deployed to keeping a watchful eye on the behavior of your running containers.

Sysdig capabilities

Getting started with Sysdig

Monitoring and securing your workloads and clusters begins with deploying the Sysdig agent. The agent runs as a lightweight node component that observes and processes syscalls, creates capture files, and performs auditing and compliance checks.

Using Helm, you can install the Sysdig agent on your GKE Autopilot clusters.

# Add the Sysdig chart repository (skip if it is already configured)
helm repo add sysdig https://charts.sysdig.com
helm repo update

kubectl create ns sysdig-agent

# The chart reference (sysdig/sysdig) follows the release name
helm install sysdig-agent sysdig/sysdig \
  --namespace sysdig-agent \
  --set sysdig.accessKey="$SDC_ACCESS_KEY" \
  --set sysdig.settings.collector="$COLLECTOR_URL" \
  --set sysdig.settings.collector_port=6443 \
  --set clusterName=autopilot-lugo \
  --set nodeAnalyzer.deploy=false \
  --set daemonset.affinity=null \
  --set ebpf.enabled=true \
  --set ebpf.settings.mountEtcVolume=false \
  --set-string daemonset.annotations."autopilot\.gke\.io/no-connect"=true

In the command above, you need to supply your own access key and collector address (read here from the SDC_ACCESS_KEY and COLLECTOR_URL environment variables). In addition, as shown in this example:

  • daemonset.affinity must be set to null
  • ebpf.enabled must be set to true and ebpf.settings.mountEtcVolume set to false
  • The autopilot.gke.io/no-connect annotation must be set as shown for the agent to run on Autopilot
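
If you would rather keep the Autopilot-specific settings in version control than on the command line, the same values can be written to a Helm values file. The following is a minimal sketch, not an official Sysdig template: the file name is hypothetical, the keys simply mirror the --set flags used in this post, and it assumes that a null value in a values file removes the chart default in the same way --set daemonset.affinity=null does.

```shell
#!/bin/sh
set -eu

# Hypothetical file name; override with VALUES_FILE=... if you prefer another.
VALUES_FILE="${VALUES_FILE:-sysdig-autopilot-values.yaml}"

# Keys mirror the --set flags from the helm install command in this post.
cat > "$VALUES_FILE" <<'EOF'
clusterName: autopilot-lugo
nodeAnalyzer:
  deploy: false            # NodeAnalyzer cannot run in Autopilot
daemonset:
  affinity: null           # drop the chart's default affinity
  annotations:
    autopilot.gke.io/no-connect: "true"
ebpf:
  enabled: true
  settings:
    mountEtcVolume: false
sysdig:
  settings:
    collector_port: 6443
EOF

echo "Wrote $VALUES_FILE"

# The access key and collector stay out of the file and are passed at install time:
#   helm install sysdig-agent sysdig/sysdig --namespace sysdig-agent \
#     -f "$VALUES_FILE" \
#     --set sysdig.accessKey="$SDC_ACCESS_KEY" \
#     --set sysdig.settings.collector="$COLLECTOR_URL"
```

Keeping the access key out of the file means the values file can be committed without exposing a secret.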

Once the install completes, the Sysdig agent will be running on the Google-managed cluster nodes and you’ll be able to get started with the capabilities available within Sysdig Secure and Sysdig Monitor. Note: NodeAnalyzer cannot run in Autopilot, which is why nodeAnalyzer.deploy is set to false.

As GKE Autopilot auto-provisions new nodes, the Kubernetes DaemonSet controller schedules a Sysdig agent pod onto each of them, so Sysdig services remain available without any extra steps on your part.
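
To confirm that the agent is keeping up with node provisioning, you can compare the number of agent pods the DaemonSet wants to run with the number that are ready. Here is a rough sketch: the namespace matches the install command in this post, while the DaemonSet name sysdig-agent is an assumption (check it with kubectl get daemonset -n sysdig-agent), and the agent_coverage helper is made up for illustration.

```shell
#!/bin/sh
# Namespace from the install command; DaemonSet name is an assumption.
NAMESPACE="sysdig-agent"
DAEMONSET="sysdig-agent"

# agent_coverage DESIRED READY: print a one-line coverage summary.
agent_coverage() {
  desired="${1:-0}"
  ready="${2:-0}"
  if [ "$desired" -eq 0 ]; then
    echo "unknown: no agent pods scheduled"
  elif [ "$ready" -eq "$desired" ]; then
    echo "ok: agent ready on $ready/$desired nodes"
  else
    echo "pending: agent ready on $ready/$desired nodes"
  fi
}

# Only query the cluster when kubectl is installed and the API server is reachable.
if command -v kubectl >/dev/null 2>&1 &&
   kubectl --request-timeout=5s cluster-info >/dev/null 2>&1; then
  desired=$(kubectl get daemonset "$DAEMONSET" -n "$NAMESPACE" \
    -o jsonpath='{.status.desiredNumberScheduled}' 2>/dev/null || true)
  ready=$(kubectl get daemonset "$DAEMONSET" -n "$NAMESPACE" \
    -o jsonpath='{.status.numberReady}' 2>/dev/null || true)
  agent_coverage "$desired" "$ready"
fi
```

A "pending" result right after a scale-up is normal; it should change to "ok" once the new agent pods have started.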

Example use cases for GKE Autopilot with Sysdig

Your organization has defined requirements for security and compliance, and it also cares about the health and performance of the applications you roll out onto Kubernetes. Sysdig will help you meet these requirements as you operate with GKE Autopilot. Let’s look at a few examples of what you can do.

Admission control

Using Sysdig, you can scan container images in your CI/CD pipelines and registries prior to deployment onto your cluster(s). Along with that, you can create admission controller policies to define the criteria to allow or disallow a container to run on GKE Autopilot.

This means, for instance, you can ensure that only images that meet your criteria as “safe” or that meet your compliance policies are allowed to run. You can easily create policies to achieve this level of control from within the Sysdig Secure interface.

Sysdig Kubernetes Admission Controller

Runtime threat detection and response

Putting preventive security measures like image scanning and admission control in place is key to secure DevOps. However, it is also critically important to watch for abnormal and unexpected behavior in production.

You can enable policies in Sysdig Secure to audit different sources of logs and events and alert you to abnormal activity so you can block threats and take corrective action. Sources of events include:

  • Host and Kubernetes events
  • Linux/container system calls
  • Cloud audit logs

Sysdig builds on top of open source Falco to monitor these data sources. Inside Sysdig Secure you can use Falco rules to implement policies to detect behavior across your workloads and GKE Autopilot.
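
To give a sense of what a Falco rule looks like, here is a small illustrative example written in open source Falco's rule syntax. It is a sketch rather than a rule shipped by Sysdig: the rule name and file name are made up, and the condition assumes the spawned_process and container macros from Falco's default ruleset are loaded.

```shell
#!/bin/sh
set -eu

# Hypothetical file name for the example rule.
RULES_FILE="${RULES_FILE:-shell-in-container.yaml}"

cat > "$RULES_FILE" <<'EOF'
# Illustrative Falco rule: flag a shell process starting inside a container.
# Assumes the spawned_process and container macros from Falco's default rules.
- rule: Shell spawned in container (example)
  desc: A shell process was started inside a container
  condition: >
    spawned_process and container and
    proc.name in (bash, sh, zsh)
  output: >
    Shell in container (user=%user.name container=%container.name
    command=%proc.cmdline)
  priority: WARNING
EOF

echo "Wrote $RULES_FILE"
```

The condition is where you would narrow or broaden detection, for example by adding container image or namespace filters.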

Container runtime security

Since the Sysdig agent hooks into the Linux kernel, it is able to provide visibility into your containers by observing system calls. In this way, you are able to monitor container behavior without requiring any agent code inside your containers.

You can scope threat detection across all of your GKE Autopilot clusters, or you can narrow the scope to specific clusters, namespaces, or even specific container types. With this flexibility you are able to tune your container detections to reduce noise and generate fewer “false positives.”

GKE Autopilot runtime policy

In these policies, you can even take the extra step of killing, stopping, or pausing a container that exhibits the unwanted behavior. This halts the activity and stops the immediate threat. In addition, to provide information and context that accelerate your response, you can have Sysdig record a capture file whenever a policy fires, giving you access to deep system-level data that will help you perform forensics.

Kubernetes audit logging

Kubernetes events are an important information source to incorporate in your GKE Autopilot security strategy. With Sysdig, you can configure Kubernetes audit logging features with the admission controller, which intercepts requests to the Kubernetes API. You can then configure Kubernetes Audit Policies that are powered by the Falco engine to filter Kubernetes events.

Using this capability you can answer questions like:

  • What happened? Which new pod was created
  • Who did it? The user, user groups, or service account
  • When did it happen? The event timestamp
  • Where did it occur? The namespace that the pod was created in

This activity is captured and displayed in the Events page of Sysdig Secure.
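
To make the what/who/when/where questions concrete, here is a sketch that pulls those four answers out of a single event in the standard Kubernetes audit format (audit.k8s.io/v1). The sample event is hand-written for illustration, the summarize_audit_event helper is hypothetical, and the script assumes jq is installed. The format Sysdig uses internally may differ.

```shell
#!/bin/sh
# Hand-written sample event for illustration (not captured from a real cluster).
SAMPLE_EVENT='{"kind":"Event","apiVersion":"audit.k8s.io/v1","verb":"create","user":{"username":"jane@example.com","groups":["system:authenticated"]},"objectRef":{"resource":"pods","namespace":"payments","name":"web-1"},"requestReceivedTimestamp":"2021-12-14T10:15:00.000000Z"}'

# Reads one audit event (JSON) on stdin and prints what/who/when/where.
summarize_audit_event() {
  jq -r '"what=\(.verb) \(.objectRef.resource)/\(.objectRef.name) who=\(.user.username) when=\(.requestReceivedTimestamp) where=\(.objectRef.namespace)"'
}

# Run the sample only when jq is available.
if command -v jq >/dev/null 2>&1; then
  summary=$(printf '%s' "$SAMPLE_EVENT" | summarize_audit_event)
  echo "$summary"
fi
```

The user.groups array in the event carries the group membership mentioned in the list above, so it can be added to the summary in the same way.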

GKE Autopilot and Sysdig

GKE Autopilot provides a hands-off, fully managed solution that lets you focus on your workloads while Google Cloud takes care of the rest. Security is still paramount to reducing risk and protecting your assets in the cloud. In addition to the built-in hardening that comes with GKE Autopilot, implementing secure DevOps practices with tools like Sysdig Secure will help you run confidently, knowing you have visibility into the vulnerabilities and behavior that require your attention.

To get started with Sysdig, check out our solution on the Google Cloud Marketplace. You can also start a free trial in minutes.
