Incident response in Kubernetes with Sysdig’s Activity Audit

By Jorge Salamero Sanz - NOVEMBER 12, 2019

Activity Audit is a new feature included in the Secure 3.0 release. This feature speeds incident response and enables audit by correlating container and Kubernetes activity.

Kubernetes audit and incident response use cases

When it comes to incident response for Kubernetes, SOC teams need to be able to analyze an endless list of scenarios:

  1. Show all outbound connections from the billing namespace to an unknown IP address.
  2. Trace a kubectl exec user interaction and list all the command and network activity that happened inside the Pod.
  3. Show every tcpdump command execution that has happened in a host or Kubernetes deployment.
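
As a rough illustration of the first scenario, the sketch below filters a list of connection records for outbound traffic from one namespace to addresses outside a set of known networks. The `Connection` record, the `KNOWN_NETWORKS` allowlist, and the sample data are all hypothetical; they are not Sysdig's data model or API.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Connection:
    """Hypothetical connection record, not a Sysdig data structure."""
    namespace: str
    pod: str
    direction: str  # "out" or "in"
    remote_ip: str


# Assumed allowlist of networks considered "known" for this example.
KNOWN_NETWORKS = [ip_network("10.0.0.0/8"), ip_network("192.168.0.0/16")]


def is_known(ip: str) -> bool:
    """Return True if the address falls inside any allowlisted network."""
    addr = ip_address(ip)
    return any(addr in net for net in KNOWN_NETWORKS)


def unknown_outbound(connections, namespace):
    """Outbound connections from `namespace` to addresses outside the allowlist."""
    return [
        c for c in connections
        if c.namespace == namespace
        and c.direction == "out"
        and not is_known(c.remote_ip)
    ]


# Made-up sample data; 203.0.113.7 is a documentation-range address.
SAMPLE = [
    Connection("billing", "billing-api-0", "out", "10.1.2.3"),
    Connection("billing", "billing-api-0", "out", "203.0.113.7"),
    Connection("billing", "billing-api-1", "in", "203.0.113.7"),
    Connection("frontend", "web-0", "out", "203.0.113.7"),
]

if __name__ == "__main__":
    for conn in unknown_outbound(SAMPLE, "billing"):
        print(conn.pod, "->", conn.remote_ip)
```

The other two scenarios follow the same pattern: pick the record type (commands or Kubernetes events) and filter on the scope and fields of interest.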

In addition, having an audit trail in Kubernetes is critical for SOC 2, PCI, ISO, and HIPAA compliance. This requires the ability to investigate and store all Kubernetes user, application and Pod activity, even if the container no longer exists.

Activity audit sources

Challenges when performing incident response in Kubernetes

Incident response in Kubernetes is hard because you can’t access the data needed to determine the impact of an event. Investigation and post-mortem analysis are not easy in distributed environments with highly volatile containers: more than fifty percent of containers live less than five minutes.

Logs are often not enough to answer all of the questions. Anything that developers didn’t instrument upfront won’t be reflected in the logs.

In the world of servers and VMs, teams could SSH onto a host to perform further analysis, but this concept doesn’t work for containers, as the asset landscape has evolved:

  • Containers are spun up and down, often within a matter of seconds. An attack could exfiltrate data before a container disappears or is rescheduled onto a different node. When the container dies, any record of that container is also gone.
  • Containers communicate over ad-hoc virtual software-defined networks, and their internal IP addresses change frequently.
  • User and application interactions with the infrastructure are decoupled. The Kubernetes control plane is separated from the workload and handles orchestration, resource access and execution privileges. Kubernetes API audit events provide information on those changes, but do not have visibility into container activity.
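
To make the last point concrete, here is a minimal sketch of what a Kubernetes API audit event for a kubectl exec can tell you. The field names follow the `audit.k8s.io/v1` Event format, but the sample event itself (username, groups, Pod name) is made up for illustration. Note what is missing: the event records who opened the session and which command was requested, not what was run inside the container afterwards.

```python
import json
from urllib.parse import parse_qs, urlsplit

# Made-up audit.k8s.io/v1 event; user, groups, and Pod name are illustrative only.
SAMPLE_EVENT = json.loads("""
{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "stage": "ResponseStarted",
  "verb": "create",
  "requestURI": "/api/v1/namespaces/billing/pods/billing-api-0/exec?command=bash&container=api&stdin=true&tty=true",
  "user": {"username": "john.doe@example.com", "groups": ["dev-team", "system:authenticated"]},
  "sourceIPs": ["66.249.64.55"],
  "objectRef": {"resource": "pods", "subresource": "exec", "namespace": "billing", "name": "billing-api-0"}
}
""")


def is_exec_event(event: dict) -> bool:
    """True if the audit event targets the exec subresource of a Pod."""
    ref = event.get("objectRef") or {}
    return ref.get("resource") == "pods" and ref.get("subresource") == "exec"


def summarize(event: dict) -> dict:
    """Extract the who/where/what fields from an exec audit event."""
    ref = event.get("objectRef") or {}
    user = event.get("user") or {}
    query = parse_qs(urlsplit(event.get("requestURI", "")).query)
    return {
        "user": user.get("username"),
        "groups": user.get("groups", []),
        "source_ips": event.get("sourceIPs", []),
        "namespace": ref.get("namespace"),
        "pod": ref.get("name"),
        # Each argv element of the requested command is a separate "command" parameter.
        "command": query.get("command", []),
    }


if __name__ == "__main__":
    if is_exec_event(SAMPLE_EVENT):
        print(summarize(SAMPLE_EVENT))
```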

Understanding all of the changes made in your cluster by a Kubernetes user or service is nearly impossible. Without the ability to map system activity to users or services, security teams have no way to uncover malicious behavior and misconfigurations within Kubernetes. Existing tools provide this information as disparate data points that are not correlated, and as a result security teams have limited visibility and low confidence in being able to answer who did what.

Speed up incident response with Sysdig Secure Activity Audit

Sysdig’s Activity Audit speeds incident response and enables audit for Kubernetes. Sysdig captures relevant information like:

  • executed commands inside the container
  • network connections
  • Kubernetes API events, like users executing kubectl exec

By correlating this information with Kubernetes application context, the SOC team can spot abnormal activity.
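
The idea of correlation can be sketched in a few lines. The example below merges Kubernetes events, commands, and network connections for one Pod into a single time-ordered list, starting from a kubectl exec. This is only an assumption about how such a view could be built: the `Activity` record and the time-window matching are hypothetical, and the article does not describe how Sysdig implements correlation internally.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Activity:
    """Hypothetical unified record for the three data sources."""
    timestamp: datetime
    pod: str
    kind: str    # "kube", "cmd", or "net"
    detail: str


def session_timeline(activities, pod, start, window):
    """Return activity in `pod` between `start` and `start + window`, oldest first.

    Matching by Pod and time window is a simplification chosen for this sketch.
    """
    end = start + window
    selected = [a for a in activities if a.pod == pod and start <= a.timestamp <= end]
    return sorted(selected, key=lambda a: a.timestamp)


# Made-up sample data.
T0 = datetime(2019, 11, 12, 10, 0, 0)
SAMPLE = [
    Activity(T0 + timedelta(seconds=21), "billing-api-0", "net", "curl -> 203.0.113.7:443"),
    Activity(T0, "billing-api-0", "kube", "kubectl exec (pods/exec)"),
    Activity(T0 + timedelta(seconds=1), "billing-api-0", "cmd", "bash"),
    Activity(T0 + timedelta(seconds=5), "web-0", "cmd", "ls"),
    Activity(T0 + timedelta(seconds=20), "billing-api-0", "cmd", "curl"),
    Activity(T0 + timedelta(seconds=40), "billing-api-0", "cmd", "tar"),
    Activity(T0 + timedelta(hours=2), "billing-api-0", "cmd", "ps"),
]

if __name__ == "__main__":
    for a in session_timeline(SAMPLE, "billing-api-0", T0, timedelta(minutes=5)):
        print(a.timestamp.time(), a.kind, a.detail)
```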

Let’s walk through an example scenario and show you how Activity Audit allows you to investigate a security violation in Kubernetes.

Activity Audit on events feed

Any of the out-of-the-box policies, or your own configured runtime policies, can trigger a security event when unusual activity is detected. On the policy events page, you can see that the “Terminal shell in a container” policy was triggered.

You can drill into the details surrounding this incident, such as host IP and MAC address, container image details, and commands, and isolate the specific part of the Kubernetes infrastructure where the violation occurred (the particular Namespace, Deployment, and Pod).

To dig deeper, you can go to Activity Audit where you can understand the “who”, “what” and “why”.

Activity Audit explore

Here, Activity Audit is able to pinpoint the “Terminal shell in a container” violation and isolate all commands and network activity generated as part of the kubectl session over a specific time period.

The grouping on the left-hand side lets you change the scope and navigate across other workloads in the clusters you are running. The time navigation lets you move backward and forward in the audit log. With this information you can pinpoint specific actions, spot abnormal behavior, and investigate unexpected patterns.

In this example, the kubectl exec is the suspicious event that led us to start this investigation. From here you can see more details, like:

Activity Audit details

  • IP address from where kubectl was run (66.249.64.55)
  • User information (John Doe) and the group he belongs to
  • Kubernetes activity details (resource, subresource, and executed command)

You can also filter by any of these fields, to view all activity from John outside this policy violation, or anything that happened in this container over time, just to name a few examples.
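
Conceptually, this kind of filtering is a match on one or more fields of each activity record. A minimal sketch, assuming records are plain dictionaries with hypothetical keys such as `user` and `container` (not Sysdig's actual field names):

```python
def filter_activity(records, **criteria):
    """Return the records whose fields equal every given key=value criterion."""
    return [
        r for r in records
        if all(r.get(key) == value for key, value in criteria.items())
    ]


# Made-up sample records; keys and values are illustrative only.
SAMPLE = [
    {"user": "john.doe", "container": "api", "kind": "kube", "detail": "kubectl exec"},
    {"user": "john.doe", "container": "db", "kind": "kube", "detail": "kubectl get pods"},
    {"user": "jane.roe", "container": "api", "kind": "kube", "detail": "kubectl logs"},
]

if __name__ == "__main__":
    # Everything John did, regardless of container.
    print(filter_activity(SAMPLE, user="john.doe"))
    # Everything that happened in one container, regardless of user.
    print(filter_activity(SAMPLE, container="api"))
```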

Activity Audit drill down

From here, Sysdig traces what happened inside the container after the kubectl exec: commands and network connections initiated. These are different data sources that you can filter in and out.

In this example, you can see that John executed bash, then used curl to download a file from the Internet, uncompressed it with tar and gzip, and shredded the bash history. You can also see the network connection that curl opened to the remote IP address.

This is just one example of how security teams can conduct incident response and post-mortem analysis in Kubernetes.

Activity Audit complements the existing Sysdig Secure forensics capabilities (Captures), which can record all the pre- and post-attack container activity. This allows teams to analyze everything that happened, not only after an incident but also before it, so you can understand the full sequence of activity. Sysdig Secure is the only Kubernetes audit and incident response solution available today.

Conclusion

Activity Audit gives teams a strong incident response framework for recovering quickly from security breaches. It provides a detailed Kubernetes audit trail by correlating system- and container-level activity with Kubernetes activity such as kubectl sessions, so the SOC team can spot abnormal activity. That audit trail also helps teams meet compliance standards such as SOC 2, PCI, ISO, and HIPAA, even if the container no longer exists.
