Harden your LLM security with OWASP

By Nigel Douglas - SEPTEMBER 19, 2024


The OWASP Top 10 for Large Language Model (LLM) Applications was designed to educate software developers, security architects, and other hands-on practitioners about how to harden LLM security and implement more secure AI workloads.

The framework identifies the most critical vulnerabilities seen in LLM applications to date, explains the security risks associated with deploying and managing them, and describes how to mitigate them.

There is no shortage of resources on the web documenting the need for, and the benefits of, an open source risk management project like the OWASP Top 10 for LLMs.

However, many practitioners struggle to discern how cross-functional teams can align to better manage the rollout of Generative AI (GenAI) technologies within their organizations. There is also a need for comprehensive security controls to support the secure rollout of GenAI workloads.

Finally, there is an educational need around how these projects can help security leadership, such as the CISO, understand how the OWASP Top 10 for LLMs differs from industry threat mapping frameworks such as MITRE ATT&CK and MITRE ATLAS.

Understanding the differences between AI, ML, & LLMs

Artificial Intelligence (AI) has undergone monumental growth over the past few decades. If we look as far back as 1951, a year after Isaac Asimov published his science fiction concept, the “Three Laws of Robotics,” the first AI program was written by Christopher Strachey to play checkers (or draughts, as it’s known in the UK).

Where AI is a broad term encompassing the fields of computer science that allow machines to accomplish tasks resembling human behavior, Machine Learning (ML) and GenAI are two clearly defined subcategories of AI.

ML was not replaced by GenAI; rather, it remains defined by its own specific use cases. ML algorithms are typically trained on a set of data, learn from that data, and are often used to make predictions. These statistical models can predict the weather or detect anomalous behavior. They are still a key part of our financial and banking systems and are used regularly in cybersecurity to detect unwanted behavior.

GenAI, on the other hand, is a type of ML that creates new data. GenAI often uses LLMs that synthesize existing data to generate something new. Examples include services like ChatGPT and Sysdig Sage™. As the AI ecosystem rapidly evolves, organizations are increasingly deploying GenAI solutions, such as Llama 2, Midjourney, and ElevenLabs, into their cloud-native and Kubernetes environments to take advantage of the high scalability and seamless orchestration the cloud provides.

This shift is accelerating the need for robust cloud-native security frameworks capable of safeguarding AI workloads. In this context, the distinctions between AI, ML, and LLMs are critical to understanding the security implications and the governance models required to manage them effectively.

OWASP Top 10 and Kubernetes

As businesses integrate models like Llama into cloud-native environments, they often rely on platforms like Kubernetes to manage these AI workloads efficiently. This transition to cloud-native infrastructure introduces a new layer of complexity, as highlighted in the OWASP Top 10 for Kubernetes and OWASP's broader cloud-native security guidance.

The flexibility and scalability offered by Kubernetes make it easier to deploy and scale GenAI models, but these models also introduce a whole new attack surface to your organization, and that is where security leadership needs to take note. A containerized AI model running on a cloud platform is subject to a very different set of security concerns than a traditional on-premises deployment, or even other cloud-native containerized workloads, underscoring the need for comprehensive security tooling to provide proper visibility into the risks associated with this rapid AI adoption.

Who is responsible for trustworthy AI?

New GenAI benefits will continue to emerge in the years ahead, and each of these proposed benefits will bring new security challenges to address. A trustworthy AI system needs to be reliable, resilient, and responsible for securing internal data as well as sensitive customer data.

Right now, many organizations are waiting on government regulations, such as the EU AI Act, to be enforced before they start taking serious responsibility for trust in LLM systems. From a regulatory perspective, the EU AI Act is the first comprehensive AI law, but it will only come into force in 2025, barring unforeseen delays in its implementation. Since the EU's General Data Protection Regulation (GDPR) was never devised with LLM usage in mind, its broad coverage only applies to AI systems through generalized principles of data collection, data security, fairness and transparency, accuracy and reliability, and accountability.

While these GDPR principles keep organizations somewhat accountable for proper GenAI usage, the race toward official AI governance is clearly still evolving, and we are all watching and waiting for answers. Ultimately, responsibility for trustworthy AI is shared among developers, security engineering teams, and leadership, who must proactively ensure that their AI systems are reliable, secure, and ethical rather than waiting for government regulations like the EU AI Act to enforce compliance.

Incorporate LLM security & governance

Unlike the plans in the EU, in the US AI regulations are included within broader, existing consumer privacy laws. So, while we wait for formally defined governance standards for AI, what do we do in the meantime? The advice is simple: implement existing, established practices and controls. While GenAI adds a new dimension to cybersecurity, resilience, privacy, and meeting legal and regulatory requirements, the best practices that have been around for a long time are still the best way to identify issues, find vulnerabilities, fix them, and mitigate potential security issues.

AI asset inventory

It’s important to understand that an AI asset inventory should apply to both internally developed and external or third-party AI solutions. As such, there is a clear need to catalog existing AI services, tools, and owners by designating a tag in asset management for specific AI inventory. Sysdig’s approach also helps organizations seamlessly include AI components in the Software Bill of Materials (SBOM), allowing security teams to generate a comprehensive list of all the software components, dependencies, and metadata associated with their GenAI workloads. By cataloging AI data sources into Sysdig Zones based on the sensitivity of the data (protected, confidential, public), security teams can better prioritize those AI workloads based on their risk severity level.
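
To make this concrete, here is a minimal sketch of how an inventory tag might be derived from a workload's dependency manifest. The package watchlist, tag names, and sensitivity field are illustrative assumptions, not an official OWASP or Sysdig mapping.

```python
# Minimal sketch: flag workloads that pull in common GenAI libraries so they
# can be tagged in an AI asset inventory. The package list and tag names are
# illustrative assumptions, not an official OWASP or Sysdig mapping.
import json
import sys
from pathlib import Path

# Hypothetical watchlist of packages that usually indicate a GenAI workload.
GENAI_PACKAGES = {"openai", "transformers", "langchain", "llama-cpp-python", "anthropic"}

def scan_requirements(path: str) -> dict:
    """Return an inventory tag for a workload based on its requirements file."""
    deps = set()
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            # Keep only the package name, dropping version pins like "openai==1.3.0".
            deps.add(line.split("==")[0].split(">=")[0].lower())
    hits = sorted(deps & GENAI_PACKAGES)
    return {
        "source": path,
        "ai_inventory": bool(hits),          # candidate tag for the AI asset inventory
        "genai_dependencies": hits,          # feeds into the SBOM metadata
        "data_sensitivity": "unclassified",  # to be set by the data owner (protected/confidential/public)
    }

if __name__ == "__main__":
    print(json.dumps(scan_requirements(sys.argv[1]), indent=2))
```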

Posture management

From a posture perspective, you should have a tool that appropriately reports on the findings of the OWASP Top 10. With Sysdig, these reports come pre-packaged, so there is no need for custom configuration by end users, speeding up reporting and ensuring more accurate context. Since we are referring to LLM-based workloads running in Kubernetes, it’s still as vital as ever to ensure you are adhering to the various security posture controls highlighted in the OWASP Top 10 for Kubernetes.
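
As a rough illustration of what such posture checks look like in practice, the sketch below uses the official Kubernetes Python client to flag a few common misconfigurations called out in the OWASP Top 10 for Kubernetes: privileged containers, containers that may run as root, and writable root filesystems. It is a small, assumed subset of checks, not a complete posture report.

```python
# Minimal sketch: surface a few posture findings relevant to the OWASP Top 10
# for Kubernetes. Assumes cluster access and the official "kubernetes" Python
# client; the checks shown are an illustrative subset only.
from kubernetes import client, config

def audit_pods():
    config.load_kube_config()  # or config.load_incluster_config() when run inside the cluster
    v1 = client.CoreV1Api()
    for pod in v1.list_pod_for_all_namespaces(watch=False).items:
        for c in pod.spec.containers:
            sc = c.security_context
            findings = []
            if sc is None or sc.run_as_non_root is not True:
                findings.append("may run as root")
            if sc is not None and sc.privileged:
                findings.append("privileged container")
            if sc is None or not sc.read_only_root_filesystem:
                findings.append("writable root filesystem")
            if findings:
                print(f"{pod.metadata.namespace}/{pod.metadata.name}/{c.name}: {', '.join(findings)}")

if __name__ == "__main__":
    audit_pods()
```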

OWASP for LLMs

Furthermore, coordinating and mapping a business's LLM security strategy to MITRE ATLAS will also allow that organization to better determine where its LLM security is covered by current processes, such as API security standards, and where additional security holes may exist. MITRE ATLAS, which stands for “Adversarial Threat Landscape for Artificial-Intelligence Systems,” is a knowledge base powered by real-life examples of attacks on ML systems by known bad actors. Whereas the OWASP Top 10 for LLMs provides guidance on where to harden your proactive LLM security strategy, MITRE ATLAS findings can be aligned with your threat detection rules in Falco or Sysdig to better understand Tactics, Techniques, and Procedures (TTPs) modeled on the well-known MITRE ATT&CK framework.
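
One lightweight way to start that alignment is to annotate each detection rule with the ATLAS-style tactics it helps cover, then look for tactics with no coverage at all. In the sketch below, the rule names, tactic labels, and the mapping itself are illustrative placeholders rather than an official Falco or ATLAS mapping.

```python
# Minimal sketch: annotate threat detection rules (e.g., Falco rule names) with
# the MITRE ATLAS-style tactics they help cover, so coverage gaps in an LLM
# threat model are easy to spot. All names below are illustrative placeholders.
from collections import defaultdict

# Hypothetical mapping of detection rules to tactic labels.
RULE_TO_TACTICS = {
    "Unexpected outbound connection from LLM pod": ["Exfiltration"],
    "Model file modified at runtime": ["Persistence", "ML Attack Staging"],
    "Suspicious access to prompt/response logs": ["Collection"],
}

def coverage_by_tactic(rules: dict) -> dict:
    """Invert the rule mapping to see which rules cover each tactic."""
    coverage = defaultdict(list)
    for rule, tactics in rules.items():
        for tactic in tactics:
            coverage[tactic].append(rule)
    return dict(coverage)

if __name__ == "__main__":
    for tactic, rules in coverage_by_tactic(RULE_TO_TACTICS).items():
        print(f"{tactic}: {len(rules)} rule(s) -> {rules}")
```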

Conclusion

Introducing LLM-based workloads into your cloud-native environment expands your business's existing attack surface. Naturally, as highlighted in the official release of the OWASP Top 10 for LLM Applications, this presents new challenges that require special tactics and defenses drawn from frameworks such as MITRE ATLAS.

AI workloads running in Kubernetes also pose problems similar to known issues, for which established cybersecurity posture reporting, procedures, and mitigation strategies already exist, such as the OWASP Top 10 for Kubernetes. Integrating the OWASP Top 10 for LLMs into your existing cloud security controls, processes, and procedures should allow your business to considerably reduce its exposure to evolving threats.

If you found this information helpful and want to learn more about GenAI security, check out our CTO, Loris Degioanni, speaking with The CyberWire's Dave Bittner about all things Good vs. Evil in the world of Generative AI.
