How we created the first conversational AI cloud security analyst

By Flavio Mutti - AUGUST 14, 2024

SHARE:

In the rapidly evolving landscape of cybersecurity, the need for a robust and intelligent assistant capable of analyzing, summarizing, and reacting to events is paramount. This is why we designed Sysdig SageTM, our large language model (LLM)-based cloud security analyst, to be an expert in cloud detection and response (CDR). 

Sysdig Sage excels at summarizing complex events and providing clear explanations, which is crucial for identifying and promptly reacting to potential threats. By leveraging the capabilities of Sysdig Sage along with the Sysdig platform, organizations can enhance their security posture and ensure timely intervention in the face of cyber threats.

Sysdig Sage leverages specialized autonomous AI agents that work collaboratively to achieve a given goal. It excels at performing critical tasks as a part of an investigation workflow:

  • Threat identification: Based on curated policy rules, Sysdig Sage retrieves events and collects contextual information (e.g., region, host, namespace, or deployment) and assesses if a specific event is part of a broader security event.
  • Aggregation and summarization: When you potentially have hundreds of runtime events with many labels attached to each, Sysdig Sage provides a quick way to categorize your data based on dynamic, contextual scope. By expressing a query in natural language, you can use Sysdig Sage to filter and scope events for you, speeding up the process of gathering event statistics.
  • Event and behavioral analysis: An event may contain multiple pieces of information, or be part of a broader security event. Sysdig Sage correlates events and empowers you to explore the causes and understand the involved resources.
  • Insight generation: Given an event or a set of events, Sysdig Sage is able to guide and support the user in the analytical process, explaining the reasons behind the occurrence of an event by interpolating security know-how along with specific event details.
  • UI guidance and navigation: Sysdig Sage is aware of what you’re looking at, and it can act as a companion during CDR investigations by bidirectionally interacting with the UI you are already familiar with directly from the chat.
  • Response recommendation: Sysdig Sage is able to give you detailed and personalized remediation recommendations being aware of your security events, your infrastructure, and your cloud resources, and applying security knowledge crafted by security experts.

Designing Sysdig Sage

The design of Sysdig Sage for CDR focuses on harnessing the power of LLMs to provide comprehensive security insights. Building on the foundational capabilities introduced earlier, Sysdig Sage is engineered to:

  • Summarize and explain cybersecurity events with clarity and precision.
  • Present anomalies and potential threats in real time.
  • Provide actionable recommendations for mitigating identified risks.
  • Facilitate quick decision-making through well-structured and concise information.

These functionalities are pivotal for maintaining a proactive defense mechanism, enabling security teams to stay ahead of potential threats.

Furthermore, to boost the capabilities of Sysdig Sage, we designed it to leverage multi-step reasoning and contextual awareness. 

With multi-step reasoning, Sysdig Sage is capable of collecting and analyzing a large amount of information to answer each question. In this way, we enable the user to ask iterative requests that would normally require multiple actions to be accomplished. This also allows Sysdig Sage to perform multi-step analysis supporting the user in the investigation of multiple data sources all at once.

Sysdig Sage

With contextual awareness, we are able to offer a seamless experience between the Sysdig capabilities that the user knows and this new way of “chatting” with your data. And with continuous contextual awareness, the user will be able to ask questions about the data they are looking at any time. We strongly believe that this will supercharge analytics capabilities during runtime events exploration.

Sysdig Sage

The Sysdig Sage assistant internally works in the following steps:

  1. Gather the conversation. The conversation and the latest question are both sent to Sysdig Sage for initial question understanding. The question could refer either to the context of what the user is looking at in the UI, or to an investigation already being conducted about runtime events.
  2. Understand the question and apply safeguards. Sysdig Sage decides if the question can be answered right away or if more reasoning or data are needed. If the question cannot be answered, the question is declined and Sysdig Sage will provide an explanation of why it can’t help in fulfilling the request (e.g., the question is outside the context of functionalities of Sysdig Sage).
  3. Gather information. If the question is complex, Sysdig Sage is able to decompose the question into several actionable steps. For example, if asked to summarize events of the last 24 hours, Sysdig Sage is able to correctly frame the time interval, understanding the scoping that the user has explicitly requested or set in the UI, and then collect the required data from the Sysdig backend.
  4. Generate the answer. Once the data is collected and Sysdig Sage is able to determine that information is available, it then generates the final answer. The answer is then streamed to the user.

We do not exploit any customer data to train or fine-tune LLMs, so your data remains private and protected. Furthermore, we only utilize LLMs that guarantee the highest level of privacy for your data and that do not perform any kind of training with input prompts. 

Instead, to guarantee the highest level of quality, performance, and privacy, we rely on dynamic prompting strategies. The prompts are crafted by security experts and engineers, allowing the LLMs to reason about your security events, retrieving even the most hidden knowledge and patterns.

Testing Sysdig Sage

Testing Sysdig Sage was a critical phase in its development. We deployed it in real-world environments, subjecting it to both simulated and actual cyber attacks. This rigorous testing process allowed us to refine its detection and response capabilities, ensuring reliability and effectiveness under various scenarios. Additionally, we collaborated with key stakeholders, incorporating their insights and expertise to streamline and accelerate relevant user flows. This collaboration ensured that the assistant met the practical needs of end users, providing them with a valuable tool for CDR.

We developed a custom evaluation framework tailored to the unique characteristics of Sysdig Sage. This allowed us to measure its performance for the most complex use cases involving contextual awareness and multi-step reasoning, and against realistic real-time scenarios. The main flow of evaluation went as follows:

  1. Building robust and heterogeneous datasets of conversations along with ground truth.
  2. Providing a simulated but still realistic Sysdig data sandbox environment to run real-time conversations, taking into account the variability of real environments over time.
  3. Building a comprehensive set of tools and evaluation strategies collected in the evaluation framework to allow for proper assessment and reporting.
  4. Keeping the human in the loop so our ML engineers and data scientists can analyze the produced report and iteratively improve the capabilities of Sysdig Sage.

This evaluation process enabled us to improve the capabilities of Sysdig Sage over time, reducing the risk of introducing regressions and enabling continuous assessment of the quality of our system.

Conclusion

Sysdig Sage represents a significant advancement in cybersecurity technology. By integrating sophisticated detection and response features, rigorous testing, and stakeholder collaboration, we have developed a tool that enhances organizational security and facilitates proactive threat management. Its ability to summarize, explain, and react to events swiftly and accurately positions it as an essential asset for any cybersecurity team. More than just an assistant, it supplements your teams with the skills of a cloud security analyst. As cyber threats continue to evolve, so will Sysdig Sage, providing the intelligence and support needed to help you protect against emerging risks.

Subscribe and get the latest updates