Introducing Sysdig and Torq – amplify SOC efficiency via automated cloud detection and response

Published by:

Durgesh Shukla

Published:

September 26, 2024

Attackers born in the cloud

Cloud attackers are swift and sophisticated, requiring robust threat detection and response programs that can keep pace with these malicious actors born in the cloud. They exploit the automation and scale of the cloud, along with new techniques, to accelerate all stages of an attack and inflict damage within minutes.

A pertinent example of a cloud attack executed by these new-age threat actors is SSH-Snake, discovered by the Sysdig Threat Research Team (TRT). SSH-Snake is a self-modifying worm that uses SSH credentials discovered on a compromised system to spread throughout the network. The worm automatically searches known credential locations and shell history files to determine its next move. We covered SSH-Snake in detail in a deep-dive blog after its discovery, and we have previously done deep dives into other sophisticated attacks, such as SCARLETEEL.

To keep up with such attacks (and attackers), security teams require customized tooling that peers deep into the crevices of the cloud, identifies risk, and helps respond fast enough to contain threats. In fact, because of these attacks, the entire paradigm of what constitutes a great response strategy in the cloud needs to be re-conceived. The 555 Benchmark guides organizations to detect and respond to cloud attacks faster than adversaries can complete them. In short, defenders have five seconds to detect, five minutes to investigate, and five minutes to respond to any cloud attack.
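
The three time budgets of the 555 Benchmark can be written down as a simple check. The following is a minimal sketch, not part of Sysdig or Torq: the `IncidentTimeline` record and the `check_555` function are hypothetical names introduced here for illustration, and the timestamps are assumed to come from your own incident records.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Time budgets stated by the 555 Benchmark.
DETECT_BUDGET = timedelta(seconds=5)
INVESTIGATE_BUDGET = timedelta(minutes=5)
RESPOND_BUDGET = timedelta(minutes=5)


@dataclass
class IncidentTimeline:
    """Hypothetical record of when each phase of an incident completed."""
    attack_started: datetime
    detected: datetime
    investigated: datetime
    responded: datetime


def check_555(timeline: IncidentTimeline) -> dict:
    """Report, per phase, whether the elapsed time stayed within its budget."""
    return {
        "detect": timeline.detected - timeline.attack_started <= DETECT_BUDGET,
        "investigate": timeline.investigated - timeline.detected <= INVESTIGATE_BUDGET,
        "respond": timeline.responded - timeline.investigated <= RESPOND_BUDGET,
    }
```

Measuring each phase from the end of the previous one, rather than from the start of the attack, keeps the three budgets independent, so a slow investigation does not hide a fast response.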

Arming defenders of the cloud with insights and automation

Torq and Sysdig are two companies “born in the cloud” that are partnering to help customers stay ahead of cloud-savvy threat actors. Torq.io is AI-driven hyper-automation software that helps security teams accelerate investigation and response in the cloud. Used together with Sysdig, the leader in cloud security powered by runtime insights, it gives customers unmatched visibility into the cloud for detections and lets them automate their incident response workflows to meet the 555 Benchmark.

Redefining cloud detection, investigation, and response

Sysdig enables customers to optimize their cloud detection and response (CDR) use cases with automated collection and correlation of all their cloud data, including events, posture misconfigurations, exploitable vulnerabilities, and identities. The cloud context Sysdig provides is unparalleled. An interactive visualization of this context helps analysts instantly conceptualize attacks, unlocking five-minute investigations across the most advanced threats. Some key capabilities to highlight include:

Integration and workflow automation focused on cloud security for the SOC

Sysdig has partnered with Torq to provide essential out-of-the-box SOC automation workflows for CDR. Our joint customers can now respond through ready-to-use remediation workflow sets that help achieve the 555 Benchmark with instant actions for each of its steps. They can edit the out-of-the-box templated playbook and build more sophisticated ones when required. The idea is to facilitate (and inspire) purpose-built workflow playbooks that take specific actions against real-world cloud threats.

Here is a 10,000-foot view of how data flows within this integration:

Initial security events are gathered from the Sysdig HTTPS notification channel and sent to Torq.
These are triaged in seconds by an action set in the Torq workflow that leverages Sysdig APIs, so investigation and case enrichment with contextual data begin almost immediately after detection.
Torq uses the specialized context provided by Sysdig (e.g., the Kubernetes namespace for events related to containerized workloads) to find the best team and assignee in project or case management software, like Jira or ServiceNow.
Case tickets are created, triaged, enriched, and assigned to the right team and user seconds after the threat has been captured.
Teams can add auto-response steps within Torq to further sharpen the investigation, mitigation, and response strategies.
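
The routing step in this flow, using the Kubernetes namespace to pick a team and assignee, can be sketched roughly as follows. This is an illustration only, not the Torq workflow itself: the ownership table, the fallback owner, and the `kubernetes.namespace.name` label key are assumptions made for the example.

```python
# Hypothetical ownership table: Kubernetes namespace -> owning team and default assignee.
NAMESPACE_OWNERS = {
    "payments": {"team": "payments-sre", "assignee": "alice"},
    "storefront": {"team": "web-platform", "assignee": "bob"},
}
# Used when the event carries no namespace or the namespace is not in the table.
FALLBACK_OWNER = {"team": "soc", "assignee": "soc-on-call"}


def namespace_of(event: dict):
    """Read the namespace from an event; the label key is an assumed field name."""
    return event.get("labels", {}).get("kubernetes.namespace.name")


def route_event(event: dict) -> dict:
    """Pick the team and assignee that should own the case for this event."""
    owner = NAMESPACE_OWNERS.get(namespace_of(event), FALLBACK_OWNER)
    return {"event_id": event.get("id"), **owner}
```

A fallback owner matters here because not every event relates to a containerized workload, and a case without an owner would stall the investigation clock.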

Here are the various actions that can be taken leveraging this integration:

Query inventory of cloud assets and cloud-native workloads for risk factors related to the deployment topology.
Get image vulnerabilities and runtime insights for container and host images.
Get users related to Kubernetes events.
Retrieve events by ID.
Retrieve the entire relevant events history detected by Sysdig.

Automated cloud investigations and case detail enrichment for the SOC

It is typical for cloud security tools to gather massive amounts of data and security findings. Often, this is live telemetry of events, such as file dumps or captures from containers, cloud services, and identities across multiple cloud service providers. Gathering this data in a consumable format is a critical job expected of SOC analysts, incident responders, and security threat researchers.

Here is where the utility of a hyper-automation tool like Torq really comes into play. Take, for example, the screenshot below, in which Sysdig has instantly captured the fact that an attacker opened a terminal shell while executing a cloud attack, such as SCARLETEEL or SSH-Snake, and has alerted Torq.

Now look at the cases being created in Torq’s own case management system:

Torq was able to fetch all the details from this Sysdig alert and create a well-formed Jira/Torq case ticket for the incident response or forensics team.

Notice the granularity and depth of the event metadata captured by Sysdig. Typical cloud security tools fail to grasp the details relevant to Kubernetes or containerized workloads. Sysdig, however, captures both the cloud and workload details so that security threat researchers can correlate them, whether during incident response or forensics. This workflow not only captures details from an alert, but also enriches the event details based on the type of the event (Kubernetes or container event).
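
Enrichment that depends on the event type can be pictured as a small dispatch step. The sketch below is a hedged illustration, not the actual playbook logic: the label keys and the `lookups` callables are hypothetical stand-ins for whatever Sysdig API queries a real workflow would make.

```python
from typing import Callable, Dict


def event_kind(event: dict) -> str:
    """Classify an event by the labels it carries (label keys are assumed names)."""
    labels = event.get("labels", {})
    if "kubernetes.namespace.name" in labels:
        return "kubernetes"
    if "container.id" in labels:
        return "container"
    return "other"


def enrich(event: dict, lookups: Dict[str, Callable[[dict], dict]]) -> dict:
    """Attach extra context chosen by event kind.

    `lookups` maps a kind to a callable that fetches context for the event;
    in a real workflow these would wrap API queries, here they are injected.
    """
    kind = event_kind(event)
    context = lookups[kind](event) if kind in lookups else {}
    return {**event, "kind": kind, "context": context}
```

Injecting the lookups keeps the classification logic testable without network access, and lets each event kind pull only the context that is relevant to it.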

Torq improves its response workflows based on the context provided by Sysdig, including container details, vulnerability summary, and other relevant details associated with the detected event. Note how Torq is able to consume Sysdig event logs.

Within this workflow playbook, Torq can also query Sysdig APIs to take different actions.

Finally, the workflow playbook below is fully customizable, so a customer can modify its steps as required.

To summarize the data flow:

Initial security events are gathered from the Sysdig HTTPS notification channel and immediately triaged by an action set in the Torq workflow that leverages Sysdig APIs. As a result, investigation and case enrichment with contextual data begin within seconds of detection.

Sysdig identifies malicious activity and notifies Torq in real time.
The Torq workflow queries a Sysdig API (the Sysdig inventory API) to extract additional context about the container image, its configuration, and its vulnerabilities.
Torq uses the specialized context provided by Sysdig (e.g., the Kubernetes namespace) to find the best team and assignee in Jira by querying Atlassian APIs.
A Jira ticket is created, triaged, and assigned to the right team and user seconds after the threat has been captured.
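
To make the ticket-creation step concrete, here is a rough sketch of assembling an issue payload from an event. It only builds a dictionary and sends nothing. The field layout follows the general shape of a Jira issue-creation request body with a plain-text description, which is an assumption to verify against your own Jira instance; the event fields, the `Task` issue type, and the `build_jira_issue` function are hypothetical.

```python
def build_jira_issue(event: dict, project_key: str, assignee_account_id: str) -> dict:
    """Build an issue payload from an event (all event field names are assumed)."""
    labels = event.get("labels", {})
    # One "key: value" line per label, sorted so the output is stable.
    description = "\n".join(f"{key}: {value}" for key, value in sorted(labels.items()))
    return {
        "fields": {
            "project": {"key": project_key},
            "issuetype": {"name": "Task"},
            "summary": f"[Sysdig] {event.get('name', 'Security event')}",
            "description": description,
            "assignee": {"accountId": assignee_account_id},
        }
    }
```

Keeping payload construction separate from the HTTP call makes it easy to review exactly what context reaches the ticket before anything is sent.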

Extra: Teams may want to add auto-response mechanisms, such as narrowing the cluster security group as a mitigation, while the team starts the investigation.
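
As an illustration of what narrowing a security group could mean in practice, the sketch below keeps only the ingress rules whose source range falls inside a trusted network. The rule format (`port` and `cidr` keys) and the `narrow_ingress` function are hypothetical; a real workflow would read and write rules through the cloud provider's own API.

```python
import ipaddress
from typing import List


def narrow_ingress(rules: List[dict], trusted_cidrs: List[str]) -> List[dict]:
    """Keep only ingress rules whose source CIDR lies inside a trusted network."""
    trusted = [ipaddress.ip_network(cidr) for cidr in trusted_cidrs]
    kept = []
    for rule in rules:
        source = ipaddress.ip_network(rule["cidr"])
        # subnet_of() requires both networks to share an IP version, so check first.
        if any(source.version == net.version and source.subnet_of(net) for net in trusted):
            kept.append(rule)
    return kept
```

With rules for `0.0.0.0/0` on port 22 and `10.0.1.0/24` on port 443, and `10.0.0.0/16` as the only trusted range, the function keeps just the second rule, which closes off internet-facing SSH while the investigation proceeds.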

Now, imagine that this was a real attack, like SSH-Snake, and the incident responders were using traditional EDR tools. They would have had no network telemetry, and the lack of forensic detail would make the response extremely slow and laborious, especially when tracking the activity within the compromised workloads.

Leveraging easy-to-implement workflows like this one, Sysdig and Torq users can not only detect complex attacks like SSH-Snake, but also automatically stop threats within a few seconds. Other response actions, like step-up monitoring of suspicious processes or terminating compromised containers, are also possible depending on the risk appetite of the organization.
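
How risk appetite might shape the choice of response actions can be sketched as a small policy function. Everything here is hypothetical: the severity levels, the two risk-appetite values, and the escalation policy are assumptions chosen for illustration, not settings that Sysdig or Torq define.

```python
from enum import Enum
from typing import List


class Action(Enum):
    STEP_UP_MONITORING = "step_up_monitoring"
    RESTRICT_NETWORK = "restrict_network"
    TERMINATE_CONTAINER = "terminate_container"


# Hypothetical severity scale, ordered from least to most serious.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2}


def choose_actions(severity: str, risk_appetite: str) -> List[Action]:
    """Escalate with severity; terminate only when the organization tolerates little risk."""
    rank = SEVERITY_RANK[severity]
    actions = [Action.STEP_UP_MONITORING]  # always watch the workload more closely
    if rank >= SEVERITY_RANK["medium"]:
        actions.append(Action.RESTRICT_NETWORK)
    if rank == SEVERITY_RANK["high"] and risk_appetite == "low":
        actions.append(Action.TERMINATE_CONTAINER)
    return actions
```

Making the policy explicit in one place lets a team review and adjust how aggressive automated responses are, separately from the mechanics of carrying them out.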

Integration setup

Look for the Sysdig integration within the Torq hyper-automation UI.

Once found, you can implement your own workflow using the predefined Sysdig steps, or select the workflow playbook from the catalog.

Conclusion

It is crucial for companies to implement an investigation and response strategy that takes less than 10 minutes in order to safeguard their cloud environments from malicious threat actors. Cloud security from Sysdig can be turbocharged with the power of project management tools like Jira, CRMs like Salesforce, messaging apps like Slack, and much more by leveraging Torq workflows. Sysdig and Torq have come together to help our customers detect, triage, and respond to the most sophisticated cloud attack techniques. We help customers unlock the power of advanced SOAR workflows – enabling instant detection, automated investigation, data enrichment, correlation, and response.

A word of thanks to the coauthors Manuel Boira, Durgesh Shukla, and Ashish Chakrabortty of Sysdig, and Eldad Livni of Torq, for making this article come to life.
