Istio is an integration framework for microservices that provides a unified layer for observability, security, and traffic management without introducing changes to any application code.
As a service mesh platform, Istio acts as an infrastructure layer that covers communication patterns between services over a network. While Istio is platform-independent, it is designed to work best with Kubernetes.
In this article, we’ll start by explaining what Istio is and what its key features are. Next, we’ll describe the main Istio components and their purpose. We’ll walk you through the best practices for securing Istio deployments as experienced in real-world environments. Finally, we’ll take a look at how to extend Istio by using addons and integrations.
Let’s get started.
How does Istio work?
Simply put, Istio is a piece of software that allows operators to add capabilities that span multiple services and applications. Those capabilities can include observability, security, access control, traffic management, compliance, and service discovery.
Imagine you have hundreds of application services running in a cloud environment. How would you let those services discover each other, and how would you add authentication, security controls, retry mechanisms, or tracing without touching the code every time you make a change? The answer is that you use service mesh software like Istio, which abstracts all the relevant cross-cutting aspects of application delivery.
Istio is of particular importance here because it has first-class support for Kubernetes (K8s). Although a service mesh is independent of a hosting platform, it becomes more valuable when it can integrate with the K8s ecosystem (since most businesses have invested their infrastructure there). As with any adoption of new technology, it’s really important to understand how it works and how it can fit with your business goals. Learning more about Istio’s main features can help you configure it without seeing incidents in production.
Main Istio features
Istio provides a service-to-service mesh that works side-by-side with your individual applications. When you first install and configure Istio, you have a variety of options to consider: how many clusters to use, how big they should be, how many networks each service mesh operates, and so on. By understanding the main capabilities that this system provides, you will be able to map your requirements into tangible results.
These are some of the features that Istio offers:
The following instructions and configurations have been tested with Istio 1.16.1 in Kubernetes 1.25 using Kind and deployed via the istioctl tool.
Observability lets you infer the internal state of a system by examining its outputs. By capturing logs, traces, and telemetry events from each sidecar that runs alongside an application, we can correlate that data and present it in a dashboard.
Istio sends telemetry events using the Envoy sidecar proxy. A sidecar is a process that runs alongside an application and provides service-to-service functionality without changing the application code.
Istio collects and forwards your application events and metrics to pre-configured tracing tools like Jaeger. You can inspect the status of the tracing configuration by installing the Kiali addon, which is Istio’s web console. Kiali offers real-time observability into the services running within the mesh (including Istio components), and it’s a recommended addon:
    $ istioctl dashboard kiali
    http://localhost:20001/kiali
You’ll need to spend some time configuring the recommended modules (like Jaeger, Grafana, and Prometheus) for observability before you can review any tracing information.
For example, you can quickly install Grafana and Prometheus by applying the standalone manifest. Then, use the istioctl tool to navigate to the dashboard:
    $ istioctl dashboard grafana
Istio also generates metrics for every type of traffic in and out of the service mesh and can forward them to Prometheus. You can configure the level and kind of metrics by customizing the telemetry config spec when applying the IstioOperator. You can also do it by applying an override to a telemetry manifest. Speaking of the IstioOperator, you can inspect the current manifest using the istioctl profile dump command that prints it in the console:
    $ istioctl profile dump demo
    apiVersion: install.istio.io/v1alpha1
    kind: IstioOperator
    spec:
      components:
        base:
          enabled: true
        cni:
        …
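As a rough illustration of the telemetry override mentioned above, the following sketch uses the Telemetry API to switch off one standard metric mesh-wide. The resource name mesh-default and the choice of metric are assumptions made for illustration. The sketch also assumes the built-in prometheus provider is enabled, so treat it as a starting point rather than a recommended configuration.

```yaml
# Minimal sketch: mesh-wide Telemetry override (applies to the whole mesh
# because it lives in the root namespace, assumed here to be istio-system).
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: mesh-default          # hypothetical name
  namespace: istio-system
spec:
  metrics:
  - providers:
    - name: prometheus        # built-in metrics provider
    overrides:
    # Stop reporting request size on both client and server sidecars
    # to reduce the number of time series Prometheus has to store.
    - match:
        metric: REQUEST_SIZE
        mode: CLIENT_AND_SERVER
      disabled: true
```

Applying a similar resource in an application namespace instead of the root namespace would scope the override to that namespace only.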
Istio also collects logs, which can be reviewed in the Kiali dashboard by selecting Workloads, choosing a service, and navigating to its logs view.
Since Istio leverages Envoy’s distributed tracing capabilities, you can configure its tracing backends (like Zipkin, Lightstep, and OpenCensus Agent).
Since securing cloud infrastructure is a top priority for enterprises, Istio ships with a comprehensive set of security controls to mitigate common threats. The following features are supported:
- Authentication and authorization: Since Istio runs a sidecar proxy in every pod, it can authenticate and authorize the traffic between them. Istio can be configured to pass on security headers, JWT tokens, and redirect rules, and to enforce security policies. Setting up mutual TLS between pods, including authorization policies using standard DENY/ALLOW rules for access control on workloads, is only a configuration manifest away.
- Identity and certificate management: Istio comes with its own certificate authority (CA) with a root certificate, signing certificate, and key so that it can sign workload certificates. There is also a configuration for key and certificate rotation, which improves the cluster’s baseline security posture.
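To make the "one manifest away" claim concrete, here is a minimal sketch that enforces mutual TLS in a namespace and then allows only one caller to reach a workload. The namespace prod, the app: store label, and the frontend service account are hypothetical names chosen for illustration.

```yaml
# Require mutual TLS for every workload in the (hypothetical) prod namespace.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: prod
spec:
  mtls:
    mode: STRICT
---
# Allow only GET requests from the (hypothetical) frontend service account
# to workloads labeled app: store. Once an ALLOW policy selects a workload,
# requests that match no rule are denied.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: store-allow-frontend
  namespace: prod
spec:
  selector:
    matchLabels:
      app: store
  action: ALLOW
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/prod/sa/frontend"]
    to:
    - operation:
        methods: ["GET"]
```

The principal is derived from the caller's mutual TLS identity, which is why the two resources are usually applied together.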
All traffic that flows in and out of a service or pod is handled transparently by Istio. It supports reliability engineering features like circuit breakers, timeouts, and retries, which you configure through VirtualService and DestinationRule policies. However, the core traffic management feature of Istio is an abstraction called VirtualService.
A VirtualService is a custom resource that manages the configuration rules that affect traffic routing. Think of it as an internal gateway definition: you specify a path for an incoming request and you can apply rules to forward the path to a destination. You can forward custom headers, tokens, different versions of a backend service, and so on. The following manifest is used to rewrite two prefix paths into a new path:
    apiVersion: networking.istio.io/v1alpha3
    kind: VirtualService
    metadata:
      name: store-route
    spec:
      hosts:
      - store.prod.svc.cluster.local
      http:
      - name: "store-routes"
        match:
        - uri:
            prefix: "/catalog"
        - uri:
            prefix: "/listing"
        rewrite:
          uri: "/shop"
        route:
        - destination:
            host: store.prod.svc.cluster.local
            subset: v2
Any HTTP request whose path starts with /catalog or /listing will have that prefix rewritten to /shop and will be sent to the v2 subset, which is the set of pods with the label “version: v2” as declared in a corresponding DestinationRule.
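The v2 subset that the VirtualService routes to has to be declared in a DestinationRule for the same host. The sketch below shows one possible definition and adds a simple circuit-breaking policy. The resource name and all thresholds are illustrative assumptions, not tuned values.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: store-destination     # hypothetical name
spec:
  host: store.prod.svc.cluster.local
  trafficPolicy:
    connectionPool:
      http:
        # Cap queued requests so overload fails fast instead of piling up.
        http1MaxPendingRequests: 100
    outlierDetection:
      # Eject a pod from the load-balancing pool for 60s after
      # 5 consecutive 5xx responses, checking every 30s.
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 60s
  subsets:
  # Maps the subset name used by the VirtualService to a pod label.
  - name: v2
    labels:
      version: v2
```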
Using VirtualServices simplifies communication between pods with no changes in the code. Within the codebase, the application will only need to use a common name or path taken from the environment for communicating with other endpoints. For example, in the BookInfo demo application, the ratings service takes the hostnames of the peer services from the environment, which it uses for sending updates without hardcoding their hostnames.
If you intend to expose an Istio cluster to the public internet, you might want to configure Istio ingress gateways, which are one of Istio’s traffic management features. You can use either an Istio gateway or a Kubernetes gateway. Gateways can be used as public front-facing endpoints that act as frontend proxies and handle load balancing and TLS termination.
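As a sketch of what an Istio gateway with TLS termination might look like, the manifest below binds an HTTPS listener to the default ingress gateway deployment. The hostname store.example.com and the secret name store-tls-cert are hypothetical. The secret is assumed to be a Kubernetes TLS secret living in the same namespace as the ingress gateway.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: store-gateway          # hypothetical name
spec:
  # Selects the ingress gateway pods installed by the default profiles.
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
    tls:
      mode: SIMPLE                     # terminate TLS at the gateway
      credentialName: store-tls-cert   # hypothetical TLS secret
    hosts:
    - "store.example.com"              # hypothetical public hostname
```

A VirtualService then attaches to this gateway through its gateways field to route the incoming requests to services inside the mesh.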
Finally, one of the less documented features of Istio is its capability to perform chaos testing. This is a kind of testing that triggers deliberate but controlled “incidents” within a production infrastructure so that you can test the resiliency of the topology. You can, for example, introduce an HTTP delay for a specific service and test how it performs under load. You could also abort requests with specific HTTP error codes, which can be problematic to handle in microservices. All of these features make Istio a robust platform for scaling applications in a distributed environment.
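Fault injection of this kind is configured on a VirtualService. The following sketch delays a share of requests and aborts another share. The ratings host follows the BookInfo naming, while the resource name and the percentages are assumptions chosen only to illustrate the shape of the config.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: ratings-fault          # hypothetical name
spec:
  hosts:
  - ratings
  http:
  - fault:
      # Add a fixed 5s delay to 10% of requests.
      delay:
        percentage:
          value: 10.0
        fixedDelay: 5s
      # Fail 5% of requests immediately with HTTP 503.
      abort:
        percentage:
          value: 5.0
        httpStatus: 503
    route:
    - destination:
        host: ratings
```

Starting with small percentages and a single service keeps the blast radius limited while you observe how callers react.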
Istio can be extended programmatically using WebAssembly (Wasm). It uses an abstraction called EnvoyFilter that acts like a middleware framework. An EnvoyFilter is used to add custom business logic and configurations when traffic gets routed through the system.
You need to follow certain conventions when developing these plugins (which are usually written in C++). Then you compile the extension. You can reference the compiled .wasm URI in an EnvoyFilter manifest:
    apiVersion: networking.istio.io/v1alpha3
    kind: EnvoyFilter
    metadata:
      name: example-filter-config
    spec:
      configPatches:
      - applyTo: EXTENSION_CONFIG
        match:
          context: SIDECAR_INBOUND
        patch:
          operation: ADD
          value:
            name: example-filter-config
            typed_config:
              '@type': type.googleapis.com/udpa.type.v1.TypedStruct
              type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
              value:
                config:
                  vm_config:
                    code:
                      remote:
                        http_uri:
                          uri: https://storage.googleapis.com/istio-ecosystem/wasm-extensions/example/1.9.0.wasm
Once deployed, you can inspect the status logs of the EnvoyFilter within the Kiali dashboard’s Istio Config option.
Now that you have an understanding of Istio’s features, let’s take a look at its main components in detail.
Istio Control Plane components
Istio is a big platform that consists of several components all working together within a set of boundaries. The Istio control plane is a layer of components that deal with control operations. It works similarly to the K8s control plane and is the brain of the system.
Istiod is a daemon service that runs on the control plane and consolidates the Istio control plane components into a single binary. This simplifies the management and administration of this plane. The following illustration shows the main architecture of Istio:
Pilot is the service that is responsible for traffic management. It actually consists of two modules:
- pilot-agent: Runs in the sidecar or gateway container and bootstraps Envoy Agents on demand.
- pilot-discovery: Has fleet-wide traffic management capabilities such as service discovery.
Citadel is a control plane service that deals with security, authentication, and credential management. Citadel can be configured to enforce policies based on service identity and is also responsible for certificate issuance and rotation. Its current presence in the code is sparse, however. As the Istio team is gradually moving its original functionality to Istiod, Citadel will have a limited feature set in future versions of Istio.
Galley used to be Istio’s configuration management and validation component. However, it is a legacy component in newer versions of Istio, and most of its functionality was ported to Istiod. You will find only sparse references in the codebase.
Istio Data plane components
Istio’s data plane is the layer where the individual sidecar processes are located.
Envoy is an open source service proxy designed specifically for cloud-native applications. Istio employs Envoy as a sidecar process that sits beside an application within the same pod. Istio is responsible for its runtime config and lifecycle using the pilot-agent and pilot-discovery modules. You can inspect the status of the proxies at any time using the istioctl proxy-status command, and you can check a specific Envoy configuration using istioctl proxy-config. Using an EnvoyFilter lets you customize the configuration that was generated by Istio Pilot.
Istio Add-ons and integrations
One of the main selling points of any successful piece of software is its extensibility capabilities. Istio is designed to be modular by default and offers an array of extensibility options.
Third-party developers and cloud providers have taken important steps to leverage Istio’s extensibility, and they’ve implemented suitable connectors and addons. We have the following categories of Istio addons and integrations:
Major cloud providers have developed dedicated solutions for Istio-based deployments. For example, Google Cloud’s Anthos Service Mesh and Red Hat OpenShift Service Mesh are based on Istio and can work effectively within their respective ecosystems. Although businesses can still use vanilla Istio in their platforms, there are compelling reasons to adopt these native counterparts, as they help offset the management costs of their routine operations.
Third-party tools that provide services for Kubernetes deployments need to be aware of Istio and how it is configured in order to work as expected. For instance, you should be able to use an existing authorization library after you install Istio without having to rewrite any existing rules.
For example, one representative use case involves Casbin, an open source authorization library that offers an Envoy integration. Because Istio uses Envoy within the data plane, it would be useful to be able to re-use Casbin policies with Istio. Beginning with Istio 1.9, you can use third-party authorization tools by employing a CUSTOM action that delegates the access control decision to that backend. Similar addons can be used for tracing and monitoring through the use of an ExtensionProvider on the initial
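The CUSTOM action works in two parts: the mesh configuration registers an external authorizer as an extension provider, and an AuthorizationPolicy refers to it by name. The sketch below assumes a gRPC authorization service reachable at ext-authz.authz.svc.cluster.local on port 9000. That address, the provider name, the namespace, and the workload labels are all hypothetical.

```yaml
# Part 1: register the external authorizer in the mesh config
# (shown here as an IstioOperator overlay).
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    extensionProviders:
    - name: sample-ext-authz-grpc                   # hypothetical provider name
      envoyExtAuthzGrpc:
        service: ext-authz.authz.svc.cluster.local  # hypothetical service
        port: 9000
---
# Part 2: delegate the access decision for matching requests to that provider.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: ext-authz
  namespace: prod              # hypothetical namespace
spec:
  selector:
    matchLabels:
      app: store               # hypothetical workload label
  action: CUSTOM
  provider:
    name: sample-ext-authz-grpc
  rules:
  - to:
    - operation:
        paths: ["/shop/*"]
```

Only requests that match the rules are sent to the external service; everything else bypasses it, so the rules act as a filter on what gets delegated.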
We previously mentioned that Istio can be extended using custom EnvoyFilters and Wasm Plugin Modules. The latter can be dynamically loaded at runtime as well. We recommend using only verified modules for better security and compliance, as it is important to maintain a stable and secure platform without bugs.
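One way to act on the "verified modules only" advice is to pin the module to a known digest when loading it through the WasmPlugin resource. The sketch below is an assumption-laden example: the registry URL, plugin name, and digest are placeholders, and the digest must be replaced with the real SHA-256 of the module you audited.

```yaml
apiVersion: extensions.istio.io/v1alpha1
kind: WasmPlugin
metadata:
  name: example-plugin         # hypothetical name
  namespace: istio-system
spec:
  # Attach the plugin to the ingress gateway pods only.
  selector:
    matchLabels:
      istio: ingressgateway
  # Hypothetical OCI image that holds the compiled Wasm module.
  url: oci://registry.example.com/plugins/example:1.0.0
  # Placeholder digest: the proxy refuses to load a module whose
  # checksum does not match this value.
  sha256: "0000000000000000000000000000000000000000000000000000000000000000"
  # Run the plugin before Istio's own authentication filters.
  phase: AUTHN
```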
Given the sheer complexity and effort required to maintain Istio deployments in production, you need to understand the security controls and best practices that you must follow continuously.
Istio Security and Best Practices
Istio provides a detailed documentation page dedicated to securing Istio workloads. In addition to those recommendations, it will help if you adhere to the following points:
- Have a solid upgrade plan: The first thing you need to do when configuring Istio for production is to manage updates and upgrades. You need to stay on top of any security updates that Istio releases. One good way to do this is by using Kustomize for managing the different versions of the Istio operator. Start by creating a folder for each version of Istio you run (for example, 1.15.4 or 1.16.1), then add the Istio operator config along with instructions for adding a new version. Once you generate the config manifest for each version, you can apply the upgrades incrementally per cluster or environment to make sure there are no problems. This will help you standardize updates across clusters.
- Use only trusted extensions and addons: The ability to utilize external providers and addons is a welcome feature of Istio, but it’s not without risks. If you are considering adding a custom Wasm Module or an EnvoyFilter from GitHub, make sure you perform an audit to understand what it does and whether it has known bugs. The last thing you want is dependency vulnerabilities. It’s very easy for an attacker to use dependency confusion or other software supply chain attacks to infect your systems and steal sensitive data.
- Do not underprovision Istio clusters: Istio consists of several components that require CPU and memory to run efficiently. For the minimum hardware requirements, it’s recommended that you have at least 4 vCPU units and 16GB RAM for a small Istio cluster. If you underprovision your cluster, you will find that the performance of the cluster will deteriorate as the load increases, and you will have timeouts and lose availability of the system services. Make sure you monitor the cluster resource usage and have sufficient hardware capacity when deploying applications and services.
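The upgrade and provisioning points above can both be expressed in the per-version operator config. The sketch below shows what a file for version 1.16.1 might contain. The file path, the use of a revision label for side-by-side upgrades, and every resource figure are assumptions for illustration, not sizing guidance.

```yaml
# Hypothetical file: istio/1.16.1/istio-operator.yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: istio-1-16-1
  namespace: istio-system
spec:
  profile: default
  # A revision lets the new control plane run next to the old one,
  # so namespaces can be migrated one at a time.
  revision: 1-16-1
  components:
    pilot:
      k8s:
        # Explicit requests keep istiod from being starved on busy nodes.
        resources:
          requests:
            cpu: 500m
            memory: 2Gi
        # Keep at least two istiod replicas and scale out under load.
        hpaSpec:
          minReplicas: 2
          maxReplicas: 5
```

Keeping one such file per version in source control makes it straightforward to diff what changes between upgrades.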
Istio is a very useful open ecosystem for maximizing the operational capabilities of microservice applications deployed in Kubernetes. While it has gone through several transformations and changes throughout the years, it is a trusted tool for many companies and organizations. With its modular architecture, great extensibility, and flexibility, it is the de facto choice for medium to large Kubernetes clusters or any kind of multi-cloud or hybrid cloud solution.