What is an Open Policy Agent (OPA)?

Have you ever thought about how many so-called policies (or rules) you’ve configured and have to maintain in your system? For example, you may have configured policies in your application, network, code repository, code deployment, and CI/CD pipeline – and they’re all in different places in your system.

Good news! There is a better way to manage these policies. Let’s take a look!

Let me introduce you to OPA. OPA (pronounced “oh-pa”) stands for Open Policy Agent, which is an open-source, general-purpose policy engine that unifies policy enforcement across the stack. For example, if you have an application with different access levels for admins and regular users, you can use OPA to give different access to different types of users. If you use something like GitHub Actions, you can also use OPA to check whether you’ve correctly defined and allowed the repository that your code pulls the image from, among other things.

What you'll learn

What an Open Policy Agent is and how it works
How to write policies in Rego
How to enable OPA in Kubernetes

Separation of Concerns or Decoupling

Have you ever heard the terms separation of concerns, model-view-controller (MVC), or model-view-presenter (MVP)?

Before these methodologies were introduced, we had something called spaghetti code. In programming, this was where style sheets (CSS) were mixed with application logic (which could be written in PHP, Python, Perl, etc.), or data manipulation (like adding, deleting, or updating records in a database) was defined in the application or business logic. Essentially, they were all in one place.

Having spaghetti code like this makes it harder to make changes on the presentation layer, data layer, and/or application layer, since they are all interconnected. It can also make deploying changes to the production environment a daunting task.

Decoupling helps fix this problem. Separation of concerns, MVC, and MVP are all examples of decoupling. In OPA, the decoupling (separation of concerns) happens with policy decision-making: it decouples the policy from the application logic. This helps mitigate the challenge of managing policies in different distributed and complex systems.

What Is a Policy?

OPA works as a general-purpose policy engine for any service or application that needs a policy decision.

So, what is a policy? As we mentioned at the beginning, you’ll probably have defined “dos and don’ts” in your system – for example, in your application, database logic, presentation layer, networking (firewall), storage accessibility, and many other areas. A policy is nothing but a rule or guideline that is created and implemented to make your system, application, networking, or infrastructure behave in a certain way (and not in other ways).

For instance, let’s say that your application has two different types of users: regular users and admins. You can write a policy to allow admins to have access to admin pages while regular users only have access to non-admin pages. You can also limit access to certain IP addresses or header information, or limit access to only certain container registries (such as GCR.io, Quay.io, or your own in-house container registry).

At this point, you might be wondering if you can just put that logic in your application instead of having a separate policy engine. The answer to that question is yes, you can. But again, the idea is to decouple the policy to separate the concerns, so one team can focus on the policy and another on building the application.

How Does an Open Policy Agent Work?

Now that you know what a policy is, let’s take a look at how OPA works. We’ll start with a simple example.

The following image depicts the flow of an application, including how the policy and OPA handle the user’s request:

Figure 1 – Policy Workflow

This is how OPA works in the case of the image above:

A user sends a request to an application or service.
The request consists of the policy input (JSON) with the value "role": "admin".
OPA receives the input and processes it.
Based on the policy that is defined in OPA, the output is returned with the value "admin_allowed": true, which means this user is allowed to access any admin pages.

Based on the defined policy, there is a mapping between the input document and the output document that represents the correct policy decision. This mapping is defined by the OPA policy.
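
To make this mapping concrete, here is a minimal Python sketch that stands in for the policy from Figure 1. It is not how OPA evaluates policies internally; the function name decide is hypothetical and only models "input document in, output document out".

```python
def decide(policy_input: dict) -> dict:
    """Model the Figure 1 policy: map an input document to an output document."""
    output = {}
    # The field is only produced when the role is "admin".
    if policy_input.get("role") == "admin":
        output["admin_allowed"] = True
    return output


print(decide({"role": "admin"}))    # {'admin_allowed': True}
print(decide({"role": "regular"}))  # {}
```

In a real deployment the decision would come from OPA itself; the sketch only shows the shape of the two documents.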

How Are Open Policy Agents Implemented?

Now that you know what a policy is and how it handles requests and returns results, we’ll show you how to write a policy.

Let’s use the example from Figure 1. The very basic (and simplest) policy would be:

admin_allowed := true

This assignment sets admin_allowed to true, regardless of what the input is. As a result, it will allow all users to access the admin page, regardless of their roles. This is called an unconditional assignment.

Well, this is not what we want, because we know that the admin page could have billing information with credit card numbers or other sensitive information that only specific users should be able to access. What we want is called a conditional assignment. In this case, we want the admin page to be accessible only to users who have admin roles, as follows:

admin_allow := true if {
  input.role == "admin"
}

The above is a conditional variable assignment, which takes place only if the conditional block succeeds. But if the policy input does not specify the role admin, the conditional block fails and the rule does not apply. Since the rule does not apply, the output will not have the admin_allow field. So, it is the responsibility of the application or service to understand what the expected output from the policy engine is based on the configured policy.
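
On the application side, one common way to handle this is to treat a missing field as a denial. The sketch below assumes the decision has already been parsed into a Python dict; is_admin_allowed is a hypothetical helper, not part of OPA.

```python
def is_admin_allowed(decision: dict) -> bool:
    """Return True only when the policy explicitly produced admin_allow = true."""
    # A missing field means the rule did not apply, so fall back to "deny".
    return decision.get("admin_allow", False) is True


print(is_admin_allowed({"admin_allow": True}))  # True
print(is_admin_allowed({}))                     # False
```

Defaulting to "deny" keeps the application safe when the policy produces no decision at all.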

This is the heart of writing an OPA policy.

OPA Language

Now that we’ve explained what a policy is, let’s dive a bit deeper into OPA language.

In the previous section, we discussed the conditional rule, which makes an assignment to a variable if the conditional block is satisfied. But what about the OR condition? The OR condition can be utilized if you have multiple rules with the same variable name.

admin_allow := true if {
  input.role == "admin"
}

admin_allow := true if {
  input.is_billing_enabled == "yes"
}

For example, let’s say that your input role is admin and is_billing_enabled is no. The first rule matches, so admin_allow is true even though the second rule does not apply.
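
The OR behavior can be modeled in plain Python: each rule body is a separate check, and the variable is defined as soon as any one of them succeeds. This is only an illustration of the semantics with hypothetical function names, not OPA's evaluation algorithm.

```python
def role_is_admin(policy_input: dict) -> bool:
    return policy_input.get("role") == "admin"


def billing_is_enabled(policy_input: dict) -> bool:
    return policy_input.get("is_billing_enabled") == "yes"


def admin_allow(policy_input: dict) -> bool:
    # Two rules with the same name behave like a logical OR of their bodies.
    # False here stands in for "undefined" in the policy output.
    bodies = (role_is_admin, billing_is_enabled)
    return any(body(policy_input) for body in bodies)


print(admin_allow({"role": "admin", "is_billing_enabled": "no"}))    # True
print(admin_allow({"role": "regular", "is_billing_enabled": "no"}))  # False
```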

Now let’s take a look at the AND condition. We’ll jump straight to the example:

admin_allow := true if {
  input.role == "admin"
}

can_see_billing := true if {
  input.is_billing_enabled == "yes"
}

Here, there are two rules: one for admin_allow and the other for can_see_billing. Each rule is evaluated independently, so the output contains admin_allow only if input.role is admin, and can_see_billing only if input.is_billing_enabled is yes. If both conditions hold, both variables are true; if neither holds, the output contains neither variable. An application that needs the AND of the two must therefore check that both variables are present and true. To express AND inside a single rule, put both expressions in the same rule body: every expression in a body must succeed for the rule to apply.

OPA also allows rule chaining. What is rule chaining? Let us explain.

admin_allow := true if {
  is_billing_enabled == true
}

is_billing_enabled := true if {
  input.cc_info_onfile == "yes"
}

is_billing_enabled := true if {
  input.cc_not_expired == "yes"
}

In this situation, an output variable can be used to form other output variables. In our case, the is_billing_enabled can serve as an intermediate variable to determine the admin_allow variable. The is_billing_enabled variable in the example above is sometimes called a helper rule.

Another important thing to notice is that the order of the rules does not matter. For instance, the top rule uses the is_billing_enabled variable that is defined by the rule that follows. This out-of-order sequence is not an issue in OPA. OPA also works perfectly with hierarchical data like Kubernetes output and Terraform plans. We’ll discuss these further below.
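
The chaining example can be mirrored in Python to show the idea of a helper feeding another decision. The function names simply reuse the rule names from the policy; this is an illustrative sketch, not generated from or equivalent to OPA's evaluation.

```python
def admin_allow(policy_input: dict) -> bool:
    # Relies on a helper that is defined further down; as in the policy,
    # the order of the definitions does not matter.
    return is_billing_enabled(policy_input)


def is_billing_enabled(policy_input: dict) -> bool:
    # Helper: either condition is enough, like the two rules sharing one name.
    return (
        policy_input.get("cc_info_onfile") == "yes"
        or policy_input.get("cc_not_expired") == "yes"
    )


print(admin_allow({"cc_info_onfile": "yes"}))  # True
print(admin_allow({"cc_info_onfile": "no"}))   # False
```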

Like many programming languages, OPA also supports packages. A package is a way to organize policy rules that are stored in files called modules. Each module should begin with a package declaration that specifies the package path to which its rules belong. Let’s look at an example:

package policy.access

admin_allow := true if {
  input.role == "admin"
}

admin_allow := true if {
  input.is_billing_enabled == true
}

regular_allow := true if {
  input.role == "regular"
}

In the example above, we declared a module to be in the package path, policy.access. Another package can use the reference data.policy.access, as shown below:

package main
import data.policy.access

can_see_billing := true if {
  access.admin_allow == true
}

can_place_order := true if {
  access.regular_allow == true
}

As you can see, OPA uses the . (dot) operator to query or access the data.

Rego

Rego is a high-level declarative language that is used to express OPA policies. Because of its declarative nature, writing policies in Rego can be a little bit tricky. Explaining Rego from start to finish is beyond the scope of this article, so we’ve added a link to a complete guide to understanding Rego in the reference section at the end of this page. For our purposes, we’ll dive straight in.

Let’s use the example of a conditional assignment or rule (AND) from the previous section:

admin_allow if {
  input.role == "admin"
  input.is_billing_enabled == true
}

The expected output is as follows:

True, if input.role is admin and input.is_billing_enabled is true.
{} (admin_allow is undefined, which the caller should treat as false), if input.role is not admin or input.is_billing_enabled is not true.

You can see the code, input, and output in the OPA playground here.

Now let’s take a look at an example of a conditional assignment or rule (OR):

admin_allow := true if {
  input.role == "admin"
}

admin_allow := true if {
  input.is_billing_enabled == true
}

The expected output is as follows:

True, if input.role is admin or input.is_billing_enabled is true.
{} (admin_allow is undefined, which the caller should treat as false), if input.role is not admin and input.is_billing_enabled is not true.

You can see the code, input, and output in the OPA playground here.

Now let’s discuss sets and greedy searches.

allow[reason] {
  a := input.access[_]
  m := a.mode
  m == "special"
  input.type == "POST"
  reason := sprintf("mode '%s' exists and allowed to special access!", [m])
}

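First, you need to understand what _ means. The underscore is a wildcard variable: in the example above, input.access[_] iterates over every element of the input.access array, and a is bound to each element in turn. From each element, we can then use the . (dot) operator to target a key and get its value.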

Let’s say that we have the following input:

{
  "type": "POST",
  "access": [
    {
      "mode": "regular",
      "type": "member"
    },
    {
      "mode": "special",
      "type": "acct-admin"
    }
  ]
}

The _ wildcard iterates over every element of the array, which is everything between the opening and closing square brackets [ ]. The m := a.mode is a variable assignment that gets the mode of the current element. Even though a := input.access[_] yields two elements with two different modes, the rule body is evaluated once for each element, so m == "special" succeeds for the element with "mode": "special" regardless of whether it is first or last in the array.
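
If you think in loops, the rule above behaves roughly like the following Python sketch. The function name allow_reasons is hypothetical, and the loop is only a mental model of how the wildcard visits each element; it is not how OPA is implemented.

```python
def allow_reasons(policy_input: dict) -> set[str]:
    """Collect one reason per array element that satisfies the rule body."""
    reasons = set()
    for a in policy_input.get("access", []):  # plays the role of input.access[_]
        m = a.get("mode")
        if m == "special" and policy_input.get("type") == "POST":
            reasons.add(f"mode '{m}' exists and allowed to special access!")
    return reasons


request = {
    "type": "POST",
    "access": [
        {"mode": "regular", "type": "member"},
        {"mode": "special", "type": "acct-admin"},
    ],
}
print(allow_reasons(request))
# {"mode 'special' exists and allowed to special access!"}
```

Swapping the two entries in the access array gives the same set, which is the point of the example: the position of the matching element does not matter.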

This might be a little bit confusing, so we’ll give you a real example to help you understand. Just follow along with the comments on this OPA playground. As we mentioned earlier, there are lots of references that show you how to write in Rego. You can even get some practical experience using the OPA playground. Try it out!

With all of the information you’ve learned in the previous sections, you might be wondering how you can put this knowledge to good use in your organization. For example, how can you use OPA with your cloud infrastructure (like Kubernetes)? How does it work with your code repository (such as Bitbucket, GitLab, GitHub, etc.)? Can you use it with Infrastructure-as-Code (IaC) software like Terraform?

Let’s take a look.

Kubernetes

There are two different ways to enable OPA in Kubernetes: using Admission Controller (by enabling the ValidatingAdmissionWebhook admission controller to be exact) or by using OPA Gatekeeper. The latter is easier to configure. (You can find information about configuring OPA using Gatekeeper here.) Using OPA with Kubernetes entails things like:

Adding mandatory labels on any new namespaces.
Defining repositories that can be used by Kubernetes resources (for security reasons).
Defining resource limits and requests in containers.
You can find other examples on the OPA website.

Below, we’ll show you an example of a mandatory label on a namespace using Gatekeeper; in this example, the required label is named gatekeeper. (We’ll assume that you’ve already configured Gatekeeper in Kubernetes.)

First, we need to define a ConstraintTemplate. This ConstraintTemplate is where we define the Rego that implements the check, as well as the schema of the parameters that a constraint can pass to it.

apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8srequiredlabels
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredLabels
      validation:
        openAPIV3Schema:
          type: object
          properties:
            labels:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredlabels

        violation[{"msg": msg, "details": {"missing_labels": missing}}] {
          # Labels that are present on the object under review.
          provided := {label | input.review.object.metadata.labels[label]}
          # Labels that the constraint requires (passed in as parameters).
          required := {label | label := input.parameters.labels[_]}
          missing := required - provided
          count(missing) > 0
          msg := sprintf("you must provide labels: %v", [missing])
        }

After we’ve created the ConstraintTemplate, we need to create a constraint based on that template.

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: ns-must-have-gk
spec:
  enforcementAction: warn ### deny(default), dryrun, warn
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Namespace"]
  parameters:
    labels: ["gatekeeper"]

As you can see, the enforcementAction has three types: deny (which is the default), dryrun, and warn. The type you use will depend on your organization’s workflow. With warn, you will still be able to create a new namespace, but a warning is shown if you do not provide the required label while creating that namespace. For example:

$> kubectl create ns gatekeeper-test
Warning: [ns-must-have-gk] you must provide labels: {"gatekeeper"}
namespace/gatekeeper-test created
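
The core of the template's Rego is a set difference (missing := required - provided). As a rough illustration, the same check looks like this in Python; missing_labels is a hypothetical helper, and the arguments stand in for the constraint's labels parameter and the labels on the namespace being created.

```python
def missing_labels(required: list[str], object_labels: dict) -> set[str]:
    """Return the required labels that the object does not carry."""
    provided = set(object_labels)    # label keys present on the object
    return set(required) - provided  # same idea as `required - provided`


print(missing_labels(["gatekeeper"], {}))                     # {'gatekeeper'}
print(missing_labels(["gatekeeper"], {"gatekeeper": "yes"}))  # set()
```

A non-empty result corresponds to a violation, which is why the namespace created without labels triggers the warning.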

GitHub

GitHub is no longer used just as a code repository or for hosting your code. It has progressed from a code repository to a CI/CD tool and artifact repository (and it will possibly have more features in the future).

GitHub introduced GitHub Actions to enable it to function as a CI/CD tool. GitHub Actions is a continuous integration and continuous delivery (CI/CD) platform that allows you to automate your build, test, and deployment pipeline. You can create workflows that build and test every pull request to your repository, or you can deploy merged pull requests to production. This is very helpful because you can have OPA run the tests for all policies in your repo before they are deployed to production.

In the flowchart above, you can see that if the OPA test on all policies defined fails, GitHub should prevent the code from being merged until the fix is in place.

Here is an example of the GitHub Actions shown in the flowchart:

name: Run OPA Tests
on: [push]
jobs:
  Run-OPA-Tests:
    runs-on: ubuntu-latest
    steps:
    - name: Check out repository code
      uses: actions/checkout@v3

    - name: Setup OPA
      uses: open-policy-agent/setup-opa@v2
      with:
        version: latest

    - name: Run OPA Tests
      run: opa test tests/*.rego -v

Terraform

Terraform is an open-source Infrastructure-as-Code software tool that enables you to safely and predictably create, change, and improve your infrastructure.

The following is an example of a policy that requires all Google projects to have “labels” that include an “owner”:

package terraform

import input.tfplan as tfplan
import input.tfrun as tfrun

identifiers {
  r := tfplan.resource_changes[_]

  "google_project" == r.type
  some i, j
  r.instances[i].attributes.labels
  r.instances[j].attributes.labels.owner
}

deny[reason] {
  not identifiers

  reason := "Type of 'google_project' has to have 'labels' and 'owner'."
}

Here is the corresponding snippet of the Terraform plan:

{
  "mock":
  {
    "project":
    {
      "tfplan":
      {
       "resource_changes": [
         {
           "mode": "data",
           "type": "vault_generic_secret",
           "name": "gcp-credential",
           "provider": "provider.vault",
           "instances": [
             {
               "schema_version": 0,
               "attributes": {
                 "data": {
                   "value": "BOGUS"
                 },
                 "data_json": "BOGUS",
                 "id": "bogus-id",
                 ...
               }
             }
           ]
         },
         {
           "mode": "managed",
           "type": "google_project",
           "name": "cluster-project",
           "provider": "provider.google",
           "instances": [
             {
               "schema_version": 1,
               "attributes": {
                 "auto_create_network": true,
                 "id": "gcp-opa-project",
                 "labels": {
                   "app": "opa-test",
                   "owner": "acme-inc"
                 },
                 "name": "gcp-opa-project",
                 "org_id": "bogus-org-id",
                 "project_id": "gcp-opa-project",
                 "skip_delete": true,
                 "timeouts": {
                   "create": null,
                   "delete": null,
                   "read": null,
                   "update": null
                 }
               },
               "private": "bogus-private"
             }
           ]
         },
         {
           "mode": "managed",
           "type": "google_project_iam_member",
           "name": "project-iam-member",
           "provider": "provider.google",
           "instances": [
             {
               "schema_version": 0,
               "attributes": {
                 "etag": "bogus-etag",
                 "id": "bogus-id",
                 "member": "bogus-serviceaccount-email",
                 "project": "gcp-opa-project",
                 "role": "roles/owner"
               },
               "depends_on": [
                 "google_project.gcp-opa-project",
                 "google_service_account.terraform-gcp-opa-project"
               ]
             }
           ]
         },
         {
           "mode": "managed",
           "type": "agoogle_storage_bucket_iam_binding",
           "name": "project-iam-member",
           "provider": "provider.google",
           "instances": [
             {
               "schema_version": 0,
               "attributes": {
                 "etag": "bogus-etag",
                 "id": "bogus-id",
                 "member": "bogus-member",
                 "project": "gcp-opa-project",
                 "role": "roles/owner"
               },
               "depends_on": [
                 "google_project.gcp-opa-project",
                 "google_service_account.terraform-gcp-opa-project"
               ]
             }
           ]
         }
       ]
      }
    }
  }
}

In the output of the Terraform plan, you can see that:

The second element of resource_changes has the type google_project.
Based on the policy, the google_project type has to have labels and labels.owner.
If the result of the above is true, then the Terraform plan is passed and the Google project code can be deployed to production.
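
To see the logic of this check outside of Rego, here is a rough Python equivalent. It assumes the tfplan document has already been loaded into a dict; both function names are hypothetical, and the sketch only approximates the policy (for example, it tests whether the owner key is present rather than mirroring Rego's handling of undefined values).

```python
def project_has_owner_label(plan: dict) -> bool:
    """True if some google_project instance carries labels including 'owner'."""
    for change in plan.get("resource_changes", []):
        if change.get("type") != "google_project":
            continue
        for instance in change.get("instances", []):
            labels = instance.get("attributes", {}).get("labels") or {}
            if "owner" in labels:
                return True
    return False


def deny_reasons(plan: dict) -> list[str]:
    """Mirror the deny rule: one reason when the check above fails."""
    if project_has_owner_label(plan):
        return []
    return ["Type of 'google_project' has to have 'labels' and 'owner'."]


plan = {
    "resource_changes": [
        {
            "type": "google_project",
            "instances": [
                {"attributes": {"labels": {"app": "opa-test", "owner": "acme-inc"}}}
            ],
        }
    ]
}
print(deny_reasons(plan))  # []
```

An empty list means nothing was denied, which matches the case described above where the plan passes.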

Now that you have seen a first policy example, you can continue with the best practices for applying Terraform security.

Conclusion

Most organizations use multiple policy languages or models to define, protect, and manage their infrastructure. To make this manageable, they need one unified toolset and framework to handle them. Open Policy Agent (OPA) could be the right tool for your organization. It has a wide variety of integrations and use cases, including CI/CD and object storage, and it works with multiple cloud providers and programming languages. In addition, OPA is now a “graduated project” according to the CNCF, which means that there is a high likelihood of future improvements and new features. OPA is also backed by Rego, which is a high-level declarative language. Although the idea of writing in Rego might be a bit intimidating, it can actually be fun once you understand the concept.
