If you work in Security or Operations, you are surely familiar with the concept of “alert fatigue.” Alert fatigue is the state of becoming desensitized to alerts, which can lead you to ignore or minimize risks and harms your ability to respond adequately to potential security threats.
There are many potential sources for alerts or findings – cloud threats, runtime events, vulnerable images in your pipeline or registry, compliance violations, etc. The sheer volume of it all makes it almost impossible to address everything. We like to compare it to the worst of the fire season on the U.S. West Coast – there are more acres burning than firefighters have the capacity to protect. So where should you focus your resources? Where will they have the highest impact or address the highest concerns?
Being inundated with alerts and findings poses real security risks for teams. The potential to miss something important that indicates malicious activity in your environment is high. Identifying the malicious drop in an ocean of frequently repeated alerts is challenging for anyone.
Introducing ToDo, Sysdig’s solution to this problem, which guides users toward the actions that will have the highest impact. It does the work of aggregating resources with similar problems, prioritizing the most impactful actions, and guiding users to take meaningful remediations. This way, teams can easily know what to focus on first.
ToDo offers recommendations – a representation of a flow or action that is repeated across resources. Recommendations are shown as a prioritized list for each product area (such as identity, compliance, and other areas of risk). ToDo provides concise, actionable, and impactful recommendations, rather than a list of tickets that grows over time and has to be addressed one at a time.
ToDo implements the following strategies for the recommendations:
Aggregation – What resources all have the same problem?
In this example, ToDo has identified a compliance control that a large number of resources fail and that impacts multiple requirements across several policies. You can see how the failing resources are distributed across clusters. You can start a remediation flow for all affected resources or only for the resources belonging to a specific cluster.
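To make the aggregation idea concrete, here is a minimal sketch of grouping findings by failing control and then by cluster. It is only an illustration under our own assumptions: the finding tuples, control names, and function names are hypothetical and do not reflect how ToDo is implemented.

```python
from collections import defaultdict

# Hypothetical findings: (resource, cluster, failing control).
# All names here are illustrative, not real Sysdig data.
findings = [
    ("deploy/web", "prod-east", "Container running as root"),
    ("deploy/api", "prod-east", "Container running as root"),
    ("deploy/batch", "staging", "Container running as root"),
    ("deploy/web", "prod-east", "Missing memory limit"),
]

def aggregate_by_control(findings):
    """Group failing resources by control, then by cluster."""
    grouped = defaultdict(lambda: defaultdict(list))
    for resource, cluster, control in findings:
        grouped[control][cluster].append(resource)
    return grouped

def rank_controls(grouped):
    """Order controls by total failing resources, largest first."""
    totals = {
        control: sum(len(resources) for resources in clusters.values())
        for control, clusters in grouped.items()
    }
    # Ties are broken alphabetically so the order is deterministic.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))

grouped = aggregate_by_control(findings)
ranked = rank_controls(grouped)
```

With this shape, one recommendation per control replaces one ticket per resource, and the per-cluster lists support remediating a single cluster at a time.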
Impact – What is the most “bang for your buck” action you can take?
In this example, ToDo has identified where a slight change in a Kubernetes IaC (Infrastructure as Code) manifest file can fix a failing compliance control across many resources at once. By utilizing the automation promoted by IaC practices to fix violations, you can reduce the workload of your teams.
Prioritization – What is the most pressing security concern?
ToDo offers two levels of prioritization. Within a product area, the different recommendations are prioritized against each other. And within a specific recommendation, the failing resources are shown as a prioritized list with the riskiest appearing at the top.
In this example, ToDo calls out the users that have been deemed a critical risk and do not have MFA (Multifactor Authentication) enabled. A user’s risk is determined by looking at the exposure of their granted permissions and at credential attributes, such as whether access key best practices are followed. These users present a higher risk of lateral movement by attackers and should be addressed first.
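The second level of prioritization, ordering the failing resources inside one recommendation, can be sketched as a filter plus a sort. This is a simplified illustration: the `User` fields, the risk labels, and the `RISK_ORDER` ranking are assumptions made for the example, and the real risk calculation is not shown here.

```python
from dataclasses import dataclass

# Assumed risk labels, ranked from most to least urgent.
RISK_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}

@dataclass
class User:
    name: str          # hypothetical identity name
    risk: str          # one of the keys in RISK_ORDER
    mfa_enabled: bool

def users_missing_mfa(users):
    """Return users without MFA, riskiest first (name breaks ties)."""
    missing = [user for user in users if not user.mfa_enabled]
    return sorted(missing, key=lambda user: (RISK_ORDER[user.risk], user.name))

users = [
    User("alice", "high", False),
    User("bob", "critical", False),
    User("carol", "critical", True),
    User("dave", "low", False),
]
prioritized = [user.name for user in users_missing_mfa(users)]
```

Here `carol` is dropped because MFA is already enabled, and `bob` rises to the top as the only critical-risk user without it.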
Many organizations have specific compliance policies or benchmarks that they are required to comply with. In addition to satisfying compliance needs, these policies can include requirements around security best practices. Continuously evaluating your resources against them will help you avoid common misconfigurations. For example, according to Sysdig’s 2022 Cloud Native Security Usage Report, 73% of cloud accounts contain publicly exposed buckets. Continuous compliance checks against the CIS benchmark for AWS would have identified this kind of misconfiguration.
In addition to the IaC Manifest and High Frequency strategies described above, ToDo will offer recommendations to remedy failing controls that are deemed to be very high security risks.
This is one way to prioritize where to start remediating your compliance and posture violations.
Another strategy offered is identifying the easiest way to improve a specific policy score.
To improve a policy score, you need to turn a failing requirement into a passing one. A requirement can contain many failing controls, each with its own failing resources. ToDo will identify a requirement that has only one failing control, with the fewest associated failing resources. This is the quickest path to improving your Policy Score.
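The selection rule described above can be sketched in a few lines. This is our own illustration of the rule, not ToDo's code: the nested dictionary layout and the requirement and control names are hypothetical.

```python
def quickest_win(policy):
    """Pick the requirement that is closest to passing.

    `policy` maps requirement -> {failing control -> [failing resources]}.
    Only requirements with exactly one failing control qualify; among
    those, the one with the fewest failing resources wins.
    Returns (resource_count, requirement, control), or None if no
    requirement qualifies.
    """
    candidates = []
    for requirement, failing_controls in policy.items():
        if len(failing_controls) == 1:
            control, resources = next(iter(failing_controls.items()))
            candidates.append((len(resources), requirement, control))
    return min(candidates) if candidates else None

# Hypothetical policy state; an empty dict means the requirement passes.
policy = {
    "Requirement 1": {"Control A": ["r1", "r2", "r3"], "Control B": ["r4"]},
    "Requirement 2": {"Control C": ["r5", "r6"]},
    "Requirement 3": {"Control D": ["r7"]},
    "Requirement 4": {},
}
best = quickest_win(policy)
```

In this sample, fixing the single resource that fails Control D makes Requirement 3 pass, whereas Requirement 1 would need two controls fixed before the score moves.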
It’s not unusual for a modern cloud environment to include thousands of human users, applications, services, and other assets, each with a unique set of permission and access requirements to do its job.
How do you keep track of the access rights assigned to all of these human and machine users? In particular, how do you ensure that each user has only the level of access privileges necessary and avoid excessive privileges that could lead to security risks like lateral movement?
Lateral movement is seen in almost every major cloud breach. For example, cloud lateral movement can be used to break into a vulnerable container. Sysdig’s CIEM (Cloud Infrastructure Entitlements Management) solution looks at CloudTrail logs and compares the permissions granted with those actually used. That information feeds a Least Permissive Policy Suggestion, which can be copied and pasted directly into your AWS IAM (Identity and Access Management) Console.
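The core of a granted-versus-used comparison is a set difference. The sketch below is a simplified illustration, not Sysdig's algorithm: it assumes actions are already expanded to full names (no wildcards such as `s3:*`), that the used actions have already been extracted from logs, and it keeps `"Resource": "*"` only to stay short. The function name and sample actions are hypothetical.

```python
import json

def suggest_policy(granted, used):
    """Build a trimmed IAM-style policy from granted vs. used actions.

    Returns (policy_document, unused_actions).
    """
    granted, used = set(granted), set(used)
    kept = sorted(granted & used)     # granted and actually exercised
    unused = sorted(granted - used)   # candidates for removal
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            # Resource "*" is a simplification for this sketch.
            {"Effect": "Allow", "Action": kept, "Resource": "*"}
        ],
    }
    return policy, unused

granted = ["s3:GetObject", "s3:PutObject", "s3:DeleteObject", "ec2:DescribeInstances"]
used = ["s3:GetObject", "ec2:DescribeInstances"]

policy, unused = suggest_policy(granted, used)
policy_json = json.dumps(policy, indent=2)
```

A real suggestion would also need to scope resources and handle wildcard grants, which this sketch deliberately leaves out.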
ToDo will provide a prioritized list of your riskiest policies, users, or roles along with these Policy Suggestions, making it easy for you to tackle the security risk of lateral movement by moving towards a Least Permissive model.
ToDo will also summarize and prioritize other risky attributes around identity, such as unused policies, inactive users and roles, misuse of access keys, and missing MFA. Cleaning up your identity posture can prevent attackers from abusing your credentials or identities to exfiltrate data or mine cryptocurrency in your account.
Risks and vulnerabilities
Vulnerability management is one of the worst offenders when it comes to inundating your teams with findings. There are many vulnerabilities, so how can you prioritize what to fix first? Here are a few factors to look at:
- What is the vulnerability score?
- Is this image used in production?
- Is this package in execution?
- Are these vulnerabilities exploitable?
Only vulnerabilities that are tied to packages used at runtime offer a real chance of exploitation. Sysdig’s deep visibility into system calls removes all the guesswork from container vulnerability prioritization by accurately identifying vulnerabilities in packages loaded at runtime.
ToDo will call out Workloads at Risk – running workloads that have critical vulnerabilities whose affected packages are in execution.
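One way to combine the four factors listed above is a composite sort key. The sketch below is an assumption-laden illustration: the field names, the placeholder identifiers, and in particular the order in which the factors are weighed are our own choices for the example, not a description of how Sysdig ranks vulnerabilities.

```python
from dataclasses import dataclass

@dataclass
class Vuln:
    ident: str            # placeholder identifier, not a real CVE
    score: float          # vulnerability score, higher is worse
    in_production: bool   # is the image used in production?
    package_in_use: bool  # is the affected package loaded at runtime?
    exploitable: bool     # is a known exploit available?

def prioritize(vulns):
    """Sort so in-use, exploitable, production findings come first.

    False sorts before True, so each flag is negated; the score is
    negated so higher scores come first within the same flags.
    """
    return sorted(
        vulns,
        key=lambda v: (
            not v.package_in_use,
            not v.exploitable,
            not v.in_production,
            -v.score,
        ),
    )

vulns = [
    Vuln("vuln-a", 9.8, True, False, True),
    Vuln("vuln-b", 7.5, True, True, True),
    Vuln("vuln-c", 9.1, False, True, False),
]
order = [v.ident for v in prioritize(vulns)]
```

Note that `vuln-a` has the highest score but lands last, because its package is never loaded at runtime.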
In addition, an interactive graph maps workloads with high severity vulnerabilities combined with high numbers of security events. Users can toggle between severities of both events and vulnerabilities.
Frameworks like MITRE ATT&CK can be helpful for understanding which events to focus on first. ToDo breaks down all MITRE-tagged events by cluster, helping your teams to identify areas of concern quickly.
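A per-cluster breakdown of tagged events is essentially a nested count. Here is a minimal sketch, assuming each event has already been reduced to a hypothetical `(cluster, tactic)` pair; the cluster names and the sample events are invented for illustration.

```python
from collections import Counter, defaultdict

def tactics_by_cluster(events):
    """Count events per ATT&CK tactic within each cluster."""
    breakdown = defaultdict(Counter)
    for cluster, tactic in events:
        breakdown[cluster][tactic] += 1
    return breakdown

# Hypothetical events: (cluster, MITRE ATT&CK tactic tag).
events = [
    ("prod-east", "Privilege Escalation"),
    ("prod-east", "Privilege Escalation"),
    ("prod-east", "Execution"),
    ("staging", "Persistence"),
]

breakdown = tactics_by_cluster(events)
top_prod = breakdown["prod-east"].most_common(1)
```

Reading the most common tactic per cluster gives a quick pointer to where a team might look first.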
High rates of false positive alerts continue to plague security professionals despite decades of work to reduce them. This creates real risk of becoming desensitized to alerts, causing you to potentially miss the highest priority security threats. Sysdig has developed a new approach.
We have applied an analysis methodology that groups common root cause security issues based on metadata. The use of ToDo for this purpose may be one small step toward alert reduction, offering the potential for a giant leap forward in addressing this decades-old problem.
Try out ToDo in a free trial of Sysdig, and start discovering your highest priority items today.