Rule tuning is one of the most important steps in defining a security posture. With detection rules, a “one-size-fits-all” approach is impossible: every customer has a unique environment, with its own peculiarities and business needs. So, when a new rule is released, it’s crucial to understand the security use case behind the detection and reduce the false positives (FPs) as much as possible.
The Threat Research Team constantly checks whether noise occurs and handles it in one of two ways:
- If the same false positives are triggered for different customers, an exception is created in the ruleset.
- If the noise is strictly related to a customer’s environment, the tuning needs to be done locally.
Managed Sysdig OOTB rules
The out-of-the-box (OOTB) Sysdig Secure ruleset relies on managed policies. These are groups of rules with different scopes (rules on syscalls, AWS, GCP, or Azure), severities, and verbosity levels, which are updated periodically with new rules, detection improvements, or tunings.
Policies with rules that are well tuned and actually related to real threats are shipped as enabled by default, but there are some other policies with rules that detect suspicious events that are disabled by default. Customers choose which of the disabled-by-default policies they want to activate and they are good to go.
An increase in the number of alerts received can be caused by a change in a rule’s detection logic or by changes in the environment, so it is important to tune rules periodically to prevent malicious events from being hidden in the noise.
Anatomy of an exception
In Sysdig Secure, it’s possible to reduce the noise that comes from a specific rule by using exceptions. Let’s consider this Falco rule:
    - rule: Write below etc
      desc: This rule detects writing operations under the /etc folder
      condition: open_write and fd.name startswith "/etc/"
      exceptions:
      output: Write below etc detected
      priority: WARNING
      tags:
The rule detects any attempt to write to files under the /etc folder. This is suspicious because attackers may try to edit configurations or add new services during exploitation.
Once the rule is activated, let’s suppose that it raises a lot of alerts coming from the nginx service writing its own configuration files. The event itself is not malicious, so it is clearly a false positive.
This is the typical situation where an exception is needed. To be valid, an exception must have these fields:
- Name: This field identifies the exception.
- Fields: The event fields that the exception is evaluated against.
- Comps: The comparison operators used. There is one operator for each field, so the number of fields and the number of operators must be equal. The association between the two is positional: for example, the first field goes with the first comp value.
- Values: Here, we will have the list of whitelisted values; also, the association is positional.
Here is an example of a working exception:
    name: proc_name_folder
    fields: [proc.name, fd.name]
    comps: ["=", startswith]
    values:
      - ["nginx", "/etc/nginx"]
      - ["nginx-runner", "/etc/nginx"]
This translates to:
    (proc.name="nginx" and fd.name startswith "/etc/nginx") or
    (proc.name="nginx-runner" and fd.name startswith "/etc/nginx")
The exception is added to the “exceptions” field of the rule, so that this type of event no longer raises alerts.
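As a sketch of the result, the “Write below etc” rule from the example above, with the proc_name_folder exception placed in its exceptions field, could look like the following. The layout follows the usual Falco rules-file conventions, and the empty tags field is left out for brevity.

```yaml
- rule: Write below etc
  desc: This rule detects writing operations under the /etc folder
  condition: open_write and fd.name startswith "/etc/"
  exceptions:
    # Each row in "values" pairs positionally with "fields" and "comps".
    - name: proc_name_folder
      fields: [proc.name, fd.name]
      comps: ["=", startswith]
      values:
        - ["nginx", "/etc/nginx"]
        - ["nginx-runner", "/etc/nginx"]
  output: Write below etc detected
  priority: WARNING
```

An event is excluded when any single row matches in full, which is why each additional row widens the whitelist.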
Secrets for better exceptions
There is a list of best practices to follow to write valuable exceptions. They need to be versatile (the same exception could be used to whitelist different types of noise) and not too broad (if an exception is too generic, we can lose visibility on malicious events).
Here is a list of useful tips:
- Use at least two fields: This will make your exceptions precise and you’ll avoid broad whitelistings.
- These are the fields that are generally used: proc.name, proc.pname, proc.cmdline, proc.pcmdline, proc.aname (this field matches all the process’ ancestors), fd.name, and container.image.repository.
- Don’t use fields that can change their value often (like container.id, container.name, or hostName).
- Prefer these operators: in, endswith, and startswith.
- Use the contains operator only if necessary: It can be broad and can be generally replaced by startswith or endswith.
- Use “in” instead of “=”: the “in” operator allows you to use lists in exceptions and avoid big sets of similar exceptions. Here is an example:
    name: proc_name_folder
    fields: [proc.name, fd.name]
    comps: ["=", startswith]
    values:
      - ["nginx", "/etc/nginx"]
      - ["nginx-runner", "/etc/nginx"]
      - ["nginx-server", "/etc/nginx"]
This can be replaced easily with the following:
    name: proc_name_folder
    fields: [proc.name, fd.name]
    comps: [in, startswith]
    values:
      - [["nginx", "nginx-runner", "nginx-server"], "/etc/nginx"]
- Use the endswith operator for the container.image.repository field, especially for images hosted in cloud provider registries, because the first section of this field can change often. Consider the AWS Elastic Container Registry, where images are named like “aws_account_id.dkr.ecr.region.amazonaws.com/my-repository”. The account ID or the region can change, so it’s better to whitelist the image using the last section of the string. We’ll have:
    name: container_image_repo
    fields: [container.image.repository]
    comps: [endswith]
    values:
      - ["amazonaws.com/my-repository"]
- Use the “glob” operator carefully: With this operator, you can have undesired matches on paths.
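Several of these tips can be combined in a single exception. The following is only a sketch: the exception name proc_name_image_repo and the whitelisted values are hypothetical, chosen to mirror the earlier examples.

```yaml
# Hypothetical exception combining the tips above:
# two fields, "in" with a list, and "endswith" on the image repository.
name: proc_name_image_repo
fields: [proc.name, container.image.repository]
comps: [in, endswith]
values:
  # Matches nginx or nginx-runner running from the my-repository image,
  # regardless of the AWS account ID or region in the registry hostname.
  - [["nginx", "nginx-runner"], "amazonaws.com/my-repository"]
```

Because the process name and the image must both match, the exception stays narrow even though each field on its own would be fairly broad.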
Handle noise in Sysdig Secure using Tuner
Sysdig Secure has a tool that helps us to add custom values to existing exceptions easily to reduce FPs: the Tuner.
This tool can be configured to work automatically, or it can be used to create, validate, and apply custom exceptions. All the exceptions are added to the tuner file, which is available by clicking the “Policy” menu and then the “Runtime Policy Tuning” button. The file opens in an editor, ready to be updated.
Manual tuning can be easily done by using the Tuner tool. It will look at the events and, if the noise can be addressed with one of the existing exceptions, will provide a solution. The following example shows the tool:
You’ll only have to choose the best option and click “Apply tuner Exceptions” for the exception to be in place. It’s also possible to select different options and add multiple exceptions at a time.
Sometimes, the suggested exceptions don’t fit well: they can be too broad, too specific, or unable to address the noise in a safe way. In these cases, customers can add their own exception.
Let’s suppose that this exception is already defined in the rule:
    - rule: Write below etc
      exceptions:
        - name: proc_name_pname_fd_name
          fields: [proc.name, proc.pname, fd.name]
          comps: ["=", contains, endswith]
They only need to edit the tuner file by adding something like:
    - rule: name of the rule that needs tuning
      exceptions:
        - name: proc_name_pname_fd_name
          values:
            - [value1, value2, value3]
      append: true
The append field is important. It’s used to make sure that the exception is added to the existing rule. If the append field is false, the entire rule will be overwritten.
Let’s show an example:
    - rule: Write below etc
      exceptions:
        - name: proc_name_pname_fd_name
          fields: [proc.name, proc.pname, fd.name]
          comps: ["=", contains, endswith]
          values:
            - ["nginx", "nginx", "/etc/nginx"]
      append: true
With this exception, we won’t see any alert triggered by the nginx process when its parent process name contains nginx and the file being written has a path ending in /etc/nginx.
For more details, see the Falco documentation on how to write exceptions.
The automatic tuner will use the available exceptions to reduce the amount of the alerts seen for a certain rule, applying the best available exception to handle the noise. The automatic tuner applies only exceptions with at least two fields to avoid broad exceptions.
You can enable this tool in the “Runtime Policy Tuning” page by clicking on the highlighted toggle.
Define your own exceptions
It’s possible that the defined exceptions don’t fit well with the noise, or that we need a new combination of operators and fields to properly address false positives.
The tuner file can be used to define new rules or custom exceptions. If you opt for a new exception, you write it in this file. These are standard Falco exceptions, so you need to define a name, fields, comps, and values, and then set the append field to true, as seen before.
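As a minimal sketch, a brand-new exception added to the “Write below etc” rule through the tuner file might look like this. The exception name proc_pname_fd_name and its values are hypothetical; only the overall structure (name, fields, comps, values, and append set to true) comes from the description above.

```yaml
- rule: Write below etc
  exceptions:
    # Hypothetical new exception: not one of the predefined ones,
    # so fields and comps must be declared along with the values.
    - name: proc_pname_fd_name
      fields: [proc.pname, fd.name]
      comps: [in, startswith]
      values:
        - [["nginx", "nginx-runner"], "/etc/nginx"]
  # Add the exception to the existing rule instead of overwriting it.
  append: true
```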
Tuning is an important step in a rule’s lifecycle. It doesn’t happen only when a new rule is defined. Rather, it is a continuous process, because the entities in the network (software, services, operating systems, and infrastructure components) tend to change. The ruleset needs to be updated each time a change happens or a new detection is introduced.
The tuning phase is crucial to improving visibility. If we remove useless events, we’ll only see suspicious or malicious ones. But we have to be careful with exceptions. Broad whitelists can hide real malicious activity, while overly specific exceptions may be only a temporary or partial solution to the noise and can increase the effort needed to solve the problem.