When you use Kubernetes in production environments, you have a long list of options for deploying containerized applications. One of them is the Kubernetes StatefulSet, which allows your data to persist after your application containers cease to exist. This makes StatefulSets a good fit for databases, other data stores, and stateful applications in general.
To learn more, keep reading this overview and tutorial on K8s StatefulSets. We will start by explaining what StatefulSets are and when to utilize them. Then, we’ll show you how to create, update, and delete them. Finally, we will systematically compare StatefulSets with Deployments, DaemonSets, and ReplicaSets, as well as explain when to use each one.
For the purposes of this tutorial, we created a sample cluster using Kind.
Let’s get started.
What Are Kubernetes StatefulSets?
A Kubernetes StatefulSet represents a set of pods, each with its own state requirements. It gives each pod dedicated volumes, a stable and unique hostname, and a defined place in the order of deployment. The primary idea behind StatefulSets is to let developers deploy applications that store data in a filesystem and need to re-attach to the same storage when a pod restarts after a failure. Examples include databases like MySQL, PostgreSQL, and Redis, HTTP servers like NGINX and Apache when they serve content from persistent storage, and persistent brokers like Kafka and ZooKeeper.
When you deploy a StatefulSet, K8s assigns each replica its own state (volumes) and guarantees the order of deployment and updates. For example, if you specify 3 replicas for a StatefulSet, K8s deploys them one at a time, in order, and gives each one its own PersistentVolumeClaims (PVCs). When you scale a StatefulSet down, pods are removed in the reverse of the order in which they were deployed. By default, neither scaling down nor deleting a StatefulSet deletes any of the PVCs, which keeps the data safe.
So, let’s discuss the main reasons why you would use StatefulSets.
When to Use a StatefulSet
You only want to use a StatefulSet when you have specific pod requirements. First, you need to differentiate stateful and stateless applications. A stateful application persists its state in a file system, and one of its main responsibilities is to manage how that state is accessed. Database systems and applications that use the file system to store information internally are examples of stateful applications.
On the other hand, a stateless application does not hold any client data that can be used in the future or survive after a restart. Examples of stateless applications include software agents, web applications, and lambda functions.
After you have determined which applications are stateful, you can plan a role for each replica in the StatefulSet. Replicas are created in order from ordinal 0 to N-1 and removed in reverse order from N-1 to 0, where N is the number of replicas. This makes it possible to set the first pod as primary, for example, and the others as replica pods. The primary pod could handle both read and write requests from the client, and the other pods could sync with the first pod for data replication. When you introduce a new pod by scaling the StatefulSet up, K8s creates a new PVC for that pod from the StatefulSet’s volumeClaimTemplates.
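The ordering described above can be sketched in plain shell, with no cluster required. The StatefulSet name `web` and the replica count of 3 are assumptions for illustration; the `<statefulset>-<ordinal>` pod naming is the pattern Kubernetes uses.

```shell
# Illustrative only: nothing here talks to a cluster.

# creation_order NAME REPLICAS: pod names in the order they are started
creation_order() {
  i=0
  while [ "$i" -lt "$2" ]; do
    echo "$1-$i"
    i=$((i + 1))
  done
}

# deletion_order NAME REPLICAS: pod names in the order they are removed
deletion_order() {
  i=$(($2 - 1))
  while [ "$i" -ge 0 ]; do
    echo "$1-$i"
    i=$((i - 1))
  done
}

echo "scale up:  " $(creation_order web 3)
echo "scale down:" $(deletion_order web 3)
```

With a primary/replica setup, this ordering means the pod with ordinal 0 is the first to start and the last to go away, which is why it is a natural choice for the primary.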
Next, we’ll show you how to create and update StatefulSets.
How to Create a StatefulSet
For this demonstration, we will use the following StatefulSet manifest:
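The manifest itself is not reproduced here, so what follows is a minimal sketch rather than the exact file used for this tutorial. It keeps the names that appear later in the text (a Service called `nginx`, a StatefulSet called `web`, and the file name `stateful-set.yml`); the image tag, labels, storage size, and the `www` claim template name are assumptions.

```shell
# Write a sample manifest to stateful-set.yml.
# Image tag, labels, storage size, and the "www" template name are assumptions.
cat > stateful-set.yml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  clusterIP: None          # headless Service
  selector:
    app: nginx
  ports:
    - name: web
      port: 80
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: nginx       # must match the headless Service above
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.25
          ports:
            - name: web
              containerPort: 80
          volumeMounts:
            - name: www
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
    - metadata:
        name: www
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
EOF
echo "wrote stateful-set.yml"
```

You can then create both resources with `kubectl apply -f stateful-set.yml`.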
There are a few things to note:
- The Service spec defines a headless service (with clusterIP: None), which means that K8s does not allocate a cluster IP or load-balance traffic for it. Instead, a DNS lookup for the service returns the IPs of the individual pods, and each pod gets its own stable DNS name that a client can use to connect to that specific pod.
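- The StatefulSet spec uses a special volumeClaimTemplates field that defines which template to use for creating a PVC. Each of the replicas in our example will require a unique PVC.
- Every PVC created from the volumeClaimTemplates needs a PersistentVolume to bind to. If your cluster has no default StorageClass that provisions volumes dynamically, you will need to provision the PersistentVolumes yourself. Otherwise, the PVC and its pod will stay in the Pending state.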
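To make the notes above concrete, here is a small sketch that derives the stable DNS name and the PVC name for each replica. It does not query a cluster. The `default` namespace, the `cluster.local` domain, the `www` claim template name, and the replica count are assumptions; the naming patterns themselves are the ones Kubernetes uses for StatefulSet pods behind a headless service.

```shell
# Illustrative only: all values below are assumptions for this sketch.
STS=web
SERVICE=nginx
NAMESPACE=default
TEMPLATE=www
REPLICAS=3
CLUSTER_DOMAIN=cluster.local   # common default; a cluster can override it

# pod_dns ORDINAL: stable DNS name of that pod behind the headless service
pod_dns() {
  echo "$STS-$1.$SERVICE.$NAMESPACE.svc.$CLUSTER_DOMAIN"
}

# pod_pvc ORDINAL: name of the PVC created from the claim template
pod_pvc() {
  echo "$TEMPLATE-$STS-$1"
}

n=0
while [ "$n" -lt "$REPLICAS" ]; do
  echo "$(pod_dns "$n")  ->  $(pod_pvc "$n")"
  n=$((n + 1))
done
```

Because these names do not change when a pod is rescheduled, a client can keep addressing a specific replica, and that replica keeps finding the same PVC.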
After you have applied the above specification, you can check the status:
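The exact commands are not reproduced here, so the following is one possible way to check the status, assuming the StatefulSet is named `web` as elsewhere in this tutorial. The guard around `kubectl` is only there so the snippet exits cleanly on a machine without it.

```shell
STS=web   # StatefulSet name used in this tutorial

if command -v kubectl >/dev/null 2>&1; then
  # The READY column shows ready/desired replicas
  kubectl get statefulset "$STS"
  # Blocks until every replica is ready or the timeout expires
  kubectl rollout status "statefulset/$STS" --timeout=120s
  # One claim per replica is expected
  kubectl get pvc
else
  echo "kubectl is not installed; nothing to check"
fi
```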
There are many ways to update a StatefulSet. The simplest is to scale the number of replicas up/down by using the following command:
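As a sketch, scaling up could look like this. The target of 5 replicas is an assumed value chosen for illustration, and `web` is the StatefulSet name used in this tutorial.

```shell
STS=web
REPLICAS=5   # assumed target, pick whatever your workload needs

if command -v kubectl >/dev/null 2>&1; then
  kubectl scale statefulset "$STS" --replicas="$REPLICAS"
  # New pods appear one at a time, each with the next ordinal
  kubectl get statefulset "$STS"
else
  echo "kubectl is not installed; skipping scale to $REPLICAS replicas"
fi
```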
When you scale a StatefulSet down to 0, you can watch the order of operations. K8s terminates the last pod first, followed by the second-to-last, and so on:
❯ kubectl scale statefulset web --replicas 0
K8s lets you customize how pods are updated through the updateStrategy spec field, which accepts RollingUpdate (the default) or OnDelete. You can customize what happens to the PVCs when the StatefulSet is deleted or scaled down through the persistentVolumeClaimRetentionPolicy spec field.
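As an illustration of these two fields, the sketch below writes a small patch file. The file name `sts-patch.yml` and the chosen values are assumptions, not settings taken from this tutorial's cluster.

```shell
# Write a hypothetical patch that sets both fields on a StatefulSet.
cat > sts-patch.yml <<'EOF'
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 1          # only pods with ordinal >= 1 receive template changes
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Retain     # keep the PVCs when the StatefulSet is deleted
    whenScaled: Delete      # delete a pod's PVC when that pod is scaled away
EOF
echo "wrote sts-patch.yml"
```

One way to apply it is `kubectl patch statefulset web --patch-file sts-patch.yml`. Whether the retention policy is accepted depends on your Kubernetes version, since older releases do not support the field or do not allow it to be changed on an existing StatefulSet.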
If you try to update an existing StatefulSet spec by changing anything other than replicas, template, or updateStrategy, the operation will fail:
❯ kubectl apply -f stateful-set.yml
service/nginx unchanged
The StatefulSet "web" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', and 'updateStrategy' are forbidden
When you delete a StatefulSet, the PVCs bound to its pods are not deleted by default. This protects your data. You must clean them up separately, as follows:
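A sketch of that cleanup is shown below. It assumes 3 replicas and a claim template named `www`, so the claim names are derived rather than read from the cluster. The final delete is left commented out because it discards the data.

```shell
STS=web
TEMPLATE=www   # assumed volumeClaimTemplates name
REPLICAS=3     # assumed replica count

# Build the list of claim names: <template>-<statefulset>-<ordinal>
PVCS=""
i=0
while [ "$i" -lt "$REPLICAS" ]; do
  PVCS="$PVCS $TEMPLATE-$STS-$i"
  i=$((i + 1))
done
echo "claims left behind:$PVCS"

if command -v kubectl >/dev/null 2>&1; then
  kubectl delete statefulset "$STS"   # removes the pods, keeps the claims
  kubectl get pvc                     # the claims should still be listed
  # Destructive: run this only once the data is no longer needed.
  # kubectl delete pvc $PVCS
else
  echo "kubectl is not installed; nothing deleted"
fi
```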
If you keep the PVCs, you can reapply the spec, which creates the same pods in order and re-attaches each one to its existing PVC:
❯ kubectl apply -f stateful-set.yml
You can inspect the pods associated with the StatefulSet using the following command:
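The command is not reproduced here. One option, assuming the pods carry the label `app=nginx`, is the following.

```shell
SELECTOR="app=nginx"   # assumed pod label
STS=web

if command -v kubectl >/dev/null 2>&1; then
  # -o wide adds the pod IP and the node each pod landed on
  kubectl get pods -l "$SELECTOR" -o wide
  # Shows replicas, update strategy, claim templates, and recent events
  kubectl describe statefulset "$STS"
else
  echo "kubectl is not installed; nothing to inspect"
fi
```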
Next, we will explain when you should use Deployments instead of StatefulSets.
StatefulSets vs. Deployments
The key reason to use a StatefulSet is to serve a stateful application. For any other case, it’s recommended that you use a Deployment. When you start a Deployment and specify a PVC, that single PVC is shared by all pod replicas, which only works if the volume’s access mode allows it (for example, ReadOnlyMany or ReadWriteMany when the pods run on different nodes). As we’ve seen, each pod in a StatefulSet gets its own PVC.
When you’re scaling Deployments up or down, K8s does not care about the order. It will trigger them all at once. However, the order matters in a StatefulSet, and K8s will maintain that order when scaling up or down to ensure stability.
StatefulSets vs. DaemonSets
A DaemonSet is a kind of resource that makes K8s run one copy of a pod on each Kubernetes node in the cluster. For example, if you have 3 nodes, it will schedule 3 pods, one for each node. You won’t have this behavior by default in a StatefulSet unless you add pod anti-affinity rules to the pod template. Without them, a StatefulSet can place several pods on the same node as long as the node has enough resources to handle them.
You want to use a DaemonSet rather than a StatefulSet for cross-cutting, node-level services like log collectors and monitoring agents. Typically, those services are long-running apps that help facilitate introspection or monitoring rather than serve application traffic.
StatefulSets vs. ReplicaSets
A ReplicaSet keeps a fixed number of identical pods running and is the building block that a Deployment manages for you. It uses a pod template much like a StatefulSet does. In a ReplicaSet, however, K8s does not handle rolling updates automatically for you. To update your pods to a new version, you have to create a new ReplicaSet, then scale it up while you scale the old one down. For most use cases, you should use Deployments, which offer automatic rollouts. Like Deployments, ReplicaSets are only suited to stateless applications.
This wraps up our deep-dive tutorial on K8s StatefulSets and how to use them in practice. If you liked our content, you can check out our blog to read more upcoming tutorials from Sysdig related to K8s, cloud security, and open source technologies.