Kubernetes 1.27 – What’s new?

By Víctor Jiménez Cerrada - APRIL 4, 2023
Topics: Open Source

Kubernetes 1.27 is about to be released, and it comes packed with novelties! Where do we begin?

This release brings 60 enhancements, way up from the 37 enhancements in Kubernetes 1.26 and the 40 in Kubernetes 1.25. Of those 60 enhancements, 12 are graduating to Stable, 29 are existing features that keep improving, 18 are completely new, and one is a deprecated feature.

Watch out for all the deprecations and removals in this version!

The main highlight of this release is actually outside Kubernetes. The image registry has moved to a distributed system that will be closer to your datacenter, no matter which cloud provider you use.

There are also some security highlights. Seccomp by default in kubelet is now stable, and features such as the new service account tokens and the new rules for admission control continue evolving. Also, new additions like VolumeGroupSnapshot will be a great help to forensic investigators, even though they were designed with disaster recovery in mind.

Finally, following the trend we’ve been seeing in the latest releases, there are lots of quality-of-life changes that won’t catch headlines but will make everyone’s jobs easier. For example, being able to update a Pod’s resources without restarting it, improvements on how internal IPs are assigned to services, and new kubectl subcommands via plugins.

We are really hyped about this release!

There is plenty to talk about, so let’s get started with what’s new in Kubernetes 1.27.

Kubernetes 1.27 – Editor’s pick:

These are the features that look most exciting to us in this release (ymmv):

#3720 Freeze k8s.gcr.io image registry

Kubernetes removes a dependency on Google Cloud with this move, which allows the project to serve images from the cloud provider closest to you. It’s a big statement that says “This is not a Google project,” and makes the project a bit more open to everyone. Also, it’s a measure that will be effective in many areas: Faster image downloads, less data moved between data centers, and, as a result, a slightly greener cloud.

Víctor Jiménez Cerrada – Content Engineering Manager at Sysdig

#3476 VolumeGroupSnapshot

Being able to take consistent snapshots across all the volumes of your Pod is going to be a game changer for disaster recovery. Now, you won’t have to fear that your app won’t behave correctly because volumes were backed up a few seconds apart.

Going further, this is also gonna be a game changer for security research. When performing a forensics investigation you can now be sure that your snapshot faithfully represents the state of the Pod.

Miguel Hernández – Security Content Engineer at Sysdig

#1287 In-Place Update of Pod Resources

This feature has been a long time coming: the original proposal is from 2019. Now you will finally be able to update a Pod’s container resources without necessarily restarting the Pod.

The feature, which required updates to the CRI specification, will surely be appreciated by operators dealing with workloads that do not deal well with restarts and might need resource tuning from time to time (e.g., database clusters).

But you will also need to think of the implications: Now Kubernetes schedulers and monitoring tools alike will also need to look at the new fields in the PodStatus to properly evaluate available resources in a node.

Daniel Simionato – Security Content Engineer at Sysdig

#1880 Multiple Service CIDRs

No one likes limits – even less when you are working in a big cluster with lots of services. Having to manage arbitrary limits on the internal IPs of the cluster is not a fun time.

A complete rework of how these internal IPs are assigned has removed some limits, and it’s been done in a way that will provide better insights when querying the cluster resources. This is a very welcome change for all cluster admins.

You can feel a project is maturing when it can take some time to turn something good into really great.

Javier Martínez – Devops Content Engineer at Sysdig

#3638 Improve kubectl plugin resolution for non-shadowing subcommands

There will always be a balance between providing functionality and keeping a codebase maintainable. Starting with Kubernetes 1.27, developers will be able to provide subcommands in kubectl via plugins. This means a better user experience, as we’ll be able to use the kubectl command we all know without muddying its codebase. It’s a win-win.

Devid Dokash – Devops Content Engineer at Sysdig

Deprecations

A few beta APIs and features have been removed in Kubernetes 1.27, including:

Deprecated API versions that are no longer served (and you should use a newer one):

  • Kubeadm v1beta2: use kubeadm config migrate to migrate to v1beta3.
  • resource.k8s.io/v1alpha1.PodScheduling: use resource.k8s.io/v1alpha2.PodSchedulingContext.
  • DynamicResourceManagement v1alpha1: use v1alpha2.
  • CSIStorageCapacity storage.k8s.io/v1beta1: use v1.

Deprecated. Implement an alternative before the next release goes out:

  • seccomp.security.alpha.kubernetes.io/pod and container.seccomp.security.alpha.kubernetes.io annotations: use the securityContext.seccompProfile field instead.
  • SecurityContextDeny admission plugin.
  • service.kubernetes.io/topology-aware-hints annotation: use service.kubernetes.io/topology-mode.
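
As a rough sketch of the seccomp and topology migrations above, the manifests below use the securityContext.seccompProfile field instead of the seccomp annotations, and the service.kubernetes.io/topology-mode annotation instead of the topology-aware-hints one. The object names, labels, image, and ports are placeholders chosen for illustration.

```yaml
# Pod using the seccompProfile field instead of the deprecated annotations.
apiVersion: v1
kind: Pod
metadata:
  name: seccomp-demo                        # hypothetical name
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault                  # applies to every container in the Pod
  containers:
    - name: app
      image: registry.example.com/app:1.0   # placeholder image
---
# Service using the new topology annotation.
apiVersion: v1
kind: Service
metadata:
  name: topology-demo                       # hypothetical name
  annotations:
    service.kubernetes.io/topology-mode: Auto
spec:
  selector:
    app: demo
  ports:
    - port: 80
      targetPort: 8080
```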

Removed. Implement an alternative before upgrading:

  • The k8s.gcr.io registry: use registry.k8s.io instead [#3720].
  • Feature gates:
    • IPv6DualStack
    • ExpandCSIVolumes
    • ExpandInUsePersistentVolumes
    • ExpandPersistentVolumes
    • ControllerManagerLeaderMigration
    • CSIMigration
    • CSIInlineVolume
    • EphemeralContainers
    • LocalStorageCapacityIsolation
    • NetworkPolicyEndPort
    • StatefulSetMinReadySeconds
    • IdentifyPodOS
    • DaemonSetUpdateSurge
  • appProtocol: kubernetes.io/grpc.
  • Kube-apiserver flag: --master-service-namespace.
  • Kube-controller-manager flags: --enable-taint-manager and --pod-eviction-timeout.
  • Kubelet flags: --container-runtime, --master-service-namespace.
  • Azure disk in-tree storage plugin.
  • AWS kubelet credential provider: use ecr-credential-provider.
  • Metrics:
    • node_collector_evictions_number replaced by node_collector_evictions_total
    • scheduler_e2e_scheduling_duration_seconds replaced by scheduler_scheduling_attempt_duration_seconds

Other changes you should adapt your configs for:

  • Kubelet: --container-runtime-endpoint and --image-service-endpoint are migrated to kubelet config.
  • StatefulSet names must be DNS labels, rather than subdomains.
  • The resourceClaims field was modified from set to map.
  • NodeAffinity Filter plugin: Filter does not run when PreFilter returns Skip.
  • Scheduler: Don’t execute certain methods when they are not needed:
    • NodeAffinity’s Filter
    • InterPodAffinity’s Filter
    • Score
  • resource.k8s.io/v1alpha1/ResourceClaim now rejects reused UIDs.
  • Kubelet: No longer creates certain legacy iptables rules by default.
  • Kubelet: Default value for MemoryThrottlingFactor is 0.9.
  • Pod API: The field .spec.schedulingGates[*].name requires qualified names. It now mirrors validation rules of .spec.readinessGates[*].name.
  • PodSpec: Rejects invalid ResourceClaim and ResourceClaimTemplate names.
  • resource.k8s.io API: Breaking change in the AllocationResult struct.

#3720 Freeze k8s.gcr.io image registry

Stage: Stable
Feature group: sig-k8s-infra

The official Kubernetes image registry has moved to registry.k8s.io.

The previous registry, k8s.gcr.io, was hosted in Google Cloud. Relying on a single cloud provider in the multi-cloud world we live in nowadays was a bit old fashioned.

With the new registry, the Kubernetes project can provide higher availability and serve images from mirrors very close to you. Now, you may be downloading the images from the same datacenter you are in. Awesome!

The k8s.gcr.io registry will be frozen on April 3, 2023. And it will stay in this state:

  • The last 1.23 release on k8s.gcr.io will be 1.23.18 (1.23 goes EoL before the freeze).
  • The last 1.24 release on k8s.gcr.io will be 1.24.12.
  • The last 1.25 release on k8s.gcr.io will be 1.25.8.
  • The last 1.26 release on k8s.gcr.io will be 1.26.3.
  • 1.27 is expected to be released on April 12, 2023 (so it won’t be available).

You can check if your cluster has dependencies on the old image registry by running the following command, which lists every image in use, and looking for k8s.gcr.io entries:

kubectl get pods --all-namespaces -o jsonpath="{.items[*].spec.containers[*].image}" |\
tr -s '[[:space:]]' '\n' |\
sort |\
uniq -c

Find more details in the Kubernetes official announcement.

Kubernetes 1.27 API

#3488 CEL for Admission Control

Stage: Alpha
Feature group: sig-api-machinery
Feature gate: ValidatingAdmissionPolicy Default value: false

Building on #2876 CRD validation expression language, this enhancement provides, since Kubernetes 1.26, a new admission controller type (ValidatingAdmissionPolicy) that allows implementing some validations without relying on webhooks.

For example: “Deny requests for deployments with five replicas or less.”

In Kubernetes 1.27, new features have been added, like:

  • Message expressions to build the message returned when a policy rejects a request.
  • New validationActions field to define whether to deny, warn, and/or audit requests.
  • New auditAnnotations to add extra information on the audit events.
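
To make the three additions more concrete, here is a minimal sketch of a policy and its binding for the “five replicas or less” example. It assumes the admissionregistration.k8s.io/v1alpha1 API and the ValidatingAdmissionPolicy feature gate are enabled; the object names, the annotation key, and the namespace label are made up for illustration.

```yaml
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
metadata:
  name: min-replicas.example.com            # hypothetical name
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: ["apps"]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["deployments"]
  validations:
    # A request passes only when it asks for more than five replicas.
    - expression: "object.spec.replicas > 5"
      messageExpression: "'replicas must be greater than 5, got ' + string(object.spec.replicas)"
  auditAnnotations:
    # Extra information attached to the audit event of each matching request.
    - key: requested-replicas
      valueExpression: "'replicas: ' + string(object.spec.replicas)"
---
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: min-replicas-binding.example.com    # hypothetical name
spec:
  policyName: min-replicas.example.com
  validationActions: [Deny, Audit]          # reject the request and record it in the audit log
  matchResources:
    namespaceSelector:
      matchLabels:
        environment: test                   # hypothetical label
```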

Read more in our “What’s new in Kubernetes 1.26” article.

#3716 CEL-based admission webhook match conditions

Stage: Net New to Alpha
Feature group: sig-api-machinery
Feature gate: AdmissionWebhookMatchConditions Default value: false

Related to #3488 CEL for Admission Control, a new matchConditions field has been added in case you need fine-grained request filtering.

For example, you could add the following condition to a ValidatingWebhookConfiguration:

matchConditions:
  - name: 'exclude-kubelet-requests'
    expression: '!("system:nodes" in request.userInfo.groups)'

This would exclude any request coming from the nodes (from the kubelet).

#3157 Allow informers for getting a stream of data instead of chunking

Stage: Net New to Alpha
Feature group: sig-api-machinery
Feature gate: WatchList Default value: false

There is an edge case that could enable OOM attacks against large Kubernetes clusters. When a client performs a LIST request (consistent snapshot of data), memory consumption is unpredictable and high.

This enhancement provides WATCH requests as an alternative to LIST requests. In addition to implementing optimizations that reduce memory usage, data is streamed as it’s fetched in an event-based manner.

This new method reduces memory usage by several orders of magnitude. However, clients now need to adapt to a new paradigm of listing.

If you are interested in the implementation details, head to the KEP, which offers a comprehensive deep dive.

#2885 Server Side Unknown Field Validation

Stage: Graduating to Stable
Feature group: sig-api-machinery
Feature gate: ServerSideFieldValidation Default value: true

Currently, you can use kubectl --validate=true to indicate that a request should fail if it specifies unknown fields on an object. This enhancement summarizes the work to implement the validation on kube-apiserver.

Read more in our “What’s new in Kubernetes 1.23” article.

#2896 OpenAPI v3

Stage: Graduating to Stable
Feature group: sig-api-machinery
Feature gate: OpenAPIV3 Default value: true

This feature adds support to kube-apiserver to serve Kubernetes types as OpenAPI v3 objects. A new /openapi/v3/apis/{group}/{version} endpoint is available. It serves one schema per resource instead of aggregating everything into a single one.

Read more in our “Kubernetes 1.23 – What’s new?” article.

#3352 Aggregated Discovery

Stage: Graduating to Beta
Feature group: sig-api-machinery
Feature gate: AggregatedDiscoveryEndpoint Default value: true

Every Kubernetes client, like kubectl, needs to discover which APIs and versions of those APIs are available in the kube-apiserver. For that, it needs to make a request for each API and version, which causes a storm of requests.

This enhancement aims to reduce all those calls to just two.

Read more in our “What’s new in Kubernetes 1.26” article.

#2876 CRD Validation Expression Language

Stage: Graduating to Beta
Feature group: sig-api-machinery
Feature gate: CustomResourceValidationExpressions Default value: true

This enhancement implements a validation mechanism for Custom Resource Definitions (CRDs) as a complement to the existing one based on webhooks.

These validation rules use the Common Expression Language (CEL) and are included in CustomResourceDefinition schemas, using the x-kubernetes-validations extension.
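
A minimal sketch of such a rule could look like the following CustomResourceDefinition. The Widget resource, its group, and its fields are hypothetical; only the x-kubernetes-validations block is the point of the example.

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: widgets.example.com                 # hypothetical CRD
spec:
  group: example.com
  scope: Namespaced
  names:
    plural: widgets
    singular: widget
    kind: Widget
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              required: ["minReplicas", "replicas"]
              # CEL rule evaluated by kube-apiserver, no webhook involved.
              x-kubernetes-validations:
                - rule: "self.minReplicas <= self.replicas"
                  message: "replicas must be greater than or equal to minReplicas"
              properties:
                minReplicas:
                  type: integer
                replicas:
                  type: integer
```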

Read more in our “What’s new in Kubernetes 1.23” article.

Apps in Kubernetes 1.27

#3140 TimeZone support in CronJob

Stage: Graduating to Stable
Feature group: sig-apps
Feature gate: CronJobTimeZone Default value: true

This feature honors the long-standing request to support time zones in CronJob resources. Until now, the Jobs created by CronJobs were all scheduled in the same time zone: the one used by the kube-controller-manager process.
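
As an illustration, a CronJob can now carry its own time zone in the spec.timeZone field. The name, schedule, and command below are arbitrary examples.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report                      # hypothetical name
spec:
  schedule: "30 2 * * *"                    # 02:30 every day...
  timeZone: "Europe/Madrid"                 # ...in this time zone, not the controller's
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: report
              image: busybox:1.36
              command: ["sh", "-c", "date"]
```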

Read more in our “What’s new in Kubernetes 1.24” article.

#3017 PodHealthyPolicy for PodDisruptionBudget

Stage: Graduating to Beta
Feature group: sig-apps
Feature gate: PDBUnhealthyPodEvictionPolicy Default value: true

A PodDisruptionBudget allows you to communicate some minimums to your cluster administrator to make maintenance tasks easier, like “Do not destroy more than one of these” or “Keep at least two of these alive.”

The new PodHealthyPolicy allows you to expand these hints to unhealthy pods. For example, pods that are Running but not Ready.
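
A small sketch, assuming the field keeps the name it has in the policy/v1 API (unhealthyPodEvictionPolicy); the PodDisruptionBudget name and the label are placeholders.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb                             # hypothetical name
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: web
  # AlwaysAllow: Pods that are Running but not Ready can be evicted even if
  # the budget is not met. IfHealthyBudget keeps the previous behavior.
  unhealthyPodEvictionPolicy: AlwaysAllow
```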

Read more in our “What’s new in Kubernetes 1.26” article.

#3715 Elastic Indexed Jobs

Stage: Net New to Beta
Feature group: sig-apps
Feature gate: ElasticIndexedJob Default value: true

Indexed Jobs were introduced in Kubernetes 1.21 to make it easier to schedule highly parallelizable Jobs.

However, once created, you cannot change the number of jobs (spec.completions), or how many you want to run in parallel (spec.parallelism). This is quite an issue in some workloads like deep learning.

This enhancement will allow for those fields (spec.completions and spec.parallelism) to be mutable with some limitations (they need to be equal).
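
For reference, this is roughly what an Indexed Job looks like; the name, image, and command are placeholders. With the feature enabled, scaling it would mean updating spec.completions and spec.parallelism together, for example from 4 to 8, keeping both values equal.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: elastic-demo                        # hypothetical name
spec:
  completionMode: Indexed
  completions: 4                            # must stay equal to parallelism when resized
  parallelism: 4
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: worker
          image: busybox:1.36
          # Each Pod receives its index in the JOB_COMPLETION_INDEX variable.
          command: ["sh", "-c", "echo processing index $JOB_COMPLETION_INDEX"]
```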

#3335 Allow StatefulSet to control start replica ordinal numbering

Stage: Graduating to Beta
Feature group: sig-apps
Feature gate: StatefulSetStartOrdinal Default value: true

StatefulSets in Kubernetes currently number their pods using ordinal numbers, with the first replica being 0 and the last being spec.replicas - 1.

This enhancement adds a new struct with a single field to the StatefulSet manifest spec, spec.ordinals.start, which lets you define the starting number for the replicas controlled by the StatefulSet.
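
A minimal sketch of the new field, with placeholder names and image:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web                                 # hypothetical name
spec:
  serviceName: web
  replicas: 3
  ordinals:
    start: 5                                # Pods are named web-5, web-6, and web-7
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: registry.example.com/web:1.0   # placeholder image
```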

Read more in our “What’s new in Kubernetes 1.26” article.

#3329 Retriable and non-retriable Pod failures for Jobs

Stage: Major Change to Beta
Feature group: sig-apps
Feature gate: JobPodFailurePolicy Default value: true
Feature gate: PodDisruptionsCondition Default value: true

This enhancement allows us to configure a .spec.podFailurePolicy on the Job’s spec that determines whether the Job should be retried or not in case of failure. This way, Kubernetes can terminate Jobs early, avoiding increasing the backoff time in case of infrastructure failures or application errors.
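
As a hedged example, the Job below fails immediately when its main container exits with code 42 (an arbitrary value standing for a non-retriable application error), and ignores Pod failures caused by disruptions such as node drains. The name, image, and command are placeholders.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: failure-policy-demo                 # hypothetical name
spec:
  backoffLimit: 6
  podFailurePolicy:
    rules:
      # Application error: retrying will not help, so fail the whole Job.
      - action: FailJob
        onExitCodes:
          containerName: main
          operator: In
          values: [42]
      # Infrastructure disruption: do not count it against backoffLimit.
      - action: Ignore
        onPodConditions:
          - type: DisruptionTarget
  template:
    spec:
      restartPolicy: Never                  # required when using podFailurePolicy
      containers:
        - name: main
          image: busybox:1.36
          command: ["sh", "-c", "exit 42"]
```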

Read more in our “What’s new in Kubernetes 1.25” article.

#1847 Auto remove PVCs created by StatefulSet

Stage: Graduating to Beta
Feature group: sig-apps
Feature gate: StatefulSetAutoDeletePVC Default value: true

A new, optional, .spec.persistentVolumeClaimRetentionPolicy field has been added to control if and how Persistent Volume Claims (PVCs) are deleted during the lifecycle of a StatefulSet.
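
A sketch of the field in context; the StatefulSet name, image, mount path, and storage size are placeholders.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db                                  # hypothetical name
spec:
  serviceName: db
  replicas: 3
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Delete                     # remove PVCs when the StatefulSet is deleted
    whenScaled: Retain                      # keep PVCs when scaling down
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: db
          image: registry.example.com/db:1.0    # placeholder image
          volumeMounts:
            - name: data
              mountPath: /var/lib/data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```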

Read more in our “What’s new in Kubernetes 1.23” article.

Kubernetes 1.27 Auth

#3299 KMS v2 Improvements

Stage: Graduating to Beta
Feature group: sig-auth
Feature gate: KMSv2 Default value: true

This new feature aims to improve performance and maintenance of the Key Management System.

Currently, tasks like rotating a key involve multiple restarts of each kube-apiserver instance. This is necessary so every server can encrypt and decrypt all their secrets using the new key. This is a resource-consuming task that can leave the cluster out of service for some seconds.

This feature enables auto rotation of the latest key.
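
As a hedged sketch, a KMS v2 provider is declared in the kube-apiserver encryption configuration roughly as below. The plugin name and socket path are placeholders, and the file is passed to kube-apiserver with the --encryption-provider-config flag.

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - kms:
          apiVersion: v2                    # selects the KMS v2 implementation
          name: my-kms-plugin               # hypothetical plugin name
          endpoint: unix:///var/run/kms/plugin.sock   # hypothetical socket path
          timeout: 3s
      # Fallback so data stored unencrypted can still be read.
      - identity: {}
```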

Warning: Disable this feature before upgrading to Kubernetes 1.27. The implementation has changed so much that it is incompatible with the one in 1.25, and upgrading with the feature enabled can result in data loss.

Read more in our “What’s new in Kubernetes 1.25” article.

#2799 Reduction of Secret-based Service Account Tokens

Stage: Graduating to Beta
Feature group: sig-auth
Feature gate: LegacyServiceAccountTokenNoAutoGeneration Default value: true

API credentials are now obtained through the TokenRequest API, which has been stable since Kubernetes 1.22, and are mounted into Pods using a projected volume. They will be automatically invalidated when their associated Pod is deleted.

Read more in our “Kubernetes 1.24 – What’s new?” article.

#3325 Auth API to get self user attributes

Stage: Graduating to Beta
Feature group: sig-auth
Feature gate: APISelfSubjectAttributesReview Default value: true

We can now do a typical /me to check our own user attributes once we are authenticated in the cluster, by running kubectl auth whoami (kubectl alpha auth whoami in Kubernetes 1.26).

Read more in our “What’s new in Kubernetes 1.26” article.

CLI in Kubernetes 1.27

#3638 Improve kubectl plugin resolution for non-shadowing subcommands

Stage: Net New to Alpha
Feature group: sig-cli
Environment variable: KUBECTL_ENABLE_CMD_SHADOW Default value: false

This enhancement allows developers to expand kubectl functionality, allowing admins to use plugins via subcommands.

This is useful, for example, to help admins manage the increasing number of CustomResourceDefinitions. At the same time, implementing this functionality via plugins will keep the kubectl code focused and easy to maintain.

New subcommands won’t be able to overwrite the existing ones. For example, when you run:

$ kubectl create foo

kubectl will try to locate and run the kubectl-create-foo plugin, as the create foo subcommand doesn’t exist.

#3659 ApplySet: kubectl apply --prune redesign and graduation strategy

Stage: Alpha
Feature group: sig-cli
Environment variable: KUBECTL_APPLYSET Default value: 0

The kubectl apply command allows you to create or update objects from a deployment yaml file. If you want to destroy objects that are no longer part of the deployment, you can use the --prune flag, and kubectl will make its best effort to delete them.

However, this process is not perfect and it often leads to object leaking.

This enhancement aims to deprecate the current --prune flag and replace it with a new approach that performs better, leading to fewer surprises.

You’ll now be able to combine the objects in your deployments into ApplySet groups, providing explicit hints to kubectl on what objects must actually be pruned.

These hints will be provided by labels prefixed by applyset.kubernetes.io. For example, the applyset.kubernetes.io/part-of label will let you define what ApplySet an object is part of.

If you want to dig deeper, the KEP has a detailed explanation of the problems with prune and of the new proposed labels.

#2227 Default container annotation that is to be used by kubectl

Stage: Graduating to Stable
Feature group: sig-cli

A new kubectl.kubernetes.io/default-container annotation has been added to Pod to define the default container.

This simplifies using tools like kubectl logs or kubectl exec on pods with sidecar containers.
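
For example, in the Pod below (names and images are placeholders), kubectl logs and kubectl exec would target the app container unless another one is selected with -c:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sidecar-demo                        # hypothetical name
  annotations:
    kubectl.kubernetes.io/default-container: app
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0          # placeholder image
    - name: log-shipper
      image: registry.example.com/log-shipper:1.0  # placeholder sidecar image
```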

Read more in our “What’s new in Kubernetes 1.21” article.

#2590 Add subresource support to kubectl

Stage: Graduating to Beta
Feature group: sig-cli

Some kubectl commands like get, patch, edit, and replace will now contain a new flag --subresource=[subresource-name], which will allow fetching and updating status and scale subresources for all API resources.

Read more in our “What’s new in Kubernetes 1.24” article.

#3515 OpenAPI v3 for kubectl explain

Stage: Graduating to Beta
Feature group: sig-cli
Environment variable: KUBECTL_EXPLAIN_OPENAPIV3 Default value: false

This enhancement allows kubectl explain to gather the data from OpenAPIv3 instead of v2.

Read more in our “What’s new in Kubernetes 1.26” article.

Kubernetes 1.27 Instrumentation

#1748 Expose metrics about resource requests and limits that represent the pod model

Stage: Graduating to Stable
Feature group: sig-instrumentation

The kube-scheduler now exposes more metrics on the requested resources and the desired limits of all running pods. This will help cluster administrators better plan capacity and triage errors.

Read more on the release for 1.20 in the What’s new in Kubernetes series.

#647 API Server Tracing

Stage: Graduating to Beta
Feature group: sig-instrumentation
Feature gate: APIServerTracing Default value: true

This enhancement improves the API Server to allow tracing requests using OpenTelemetry libraries and the OpenTelemetry format.

Read more in our “What’s new in Kubernetes 1.22” article.

#3466 Kubernetes Component Health SLIs

Stage: Graduating to Beta
Feature group: sig-instrumentation
Feature gate: ComponentSLIs Default value: true

Starting with Kubernetes 1.26, a new endpoint /metrics/slis is available on each component exposing their Service Level Indicator (SLI) metrics in Prometheus format.

For each component, two metrics will be exposed:

  • A gauge, representing the current state of the healthcheck.
  • A counter, recording the cumulative counts observed for each healthcheck state.

Read more in our “What’s new in Kubernetes 1.26” article.

#3498 Extend metrics stability

Stage: Graduating to Beta
Feature group: sig-instrumentation

In Kubernetes 1.26, two new classes of metrics were added:

  • beta: For metrics related to beta features. They may change or disappear, but they are in a more advanced development state than the alpha ones.
  • internal: Metrics for internal usage that you shouldn’t worry about, either because they don’t provide useful information for cluster administrators, or because they may change without notice.

Read more in our “What’s new in Kubernetes 1.26” article.

Related: #1209 Metrics stability enhancement in Kubernetes 1.21.

Network in Kubernetes 1.27

#3705 Cloud Dual-Stack --node-ip Handling

Stage: Net New to Alpha
Feature group: sig-network
Feature gate: CloudDualStackNodeIPs Default value: false

This enhancement will implement the --node-ip argument in kubelet for clusters in cloud providers (using the --cloud-provider flag).

You’ll be able to specify one IPv4 address, one IPv6 address, or both. This is the IP the cluster will use to communicate with the node.

Of course, your cloud provider must also support this option. All that kubelet does is set an alpha.kubernetes.io/provided-node-ip annotation that the cloud provider can choose how to implement.

#3668 Reserve nodeport ranges for dynamic and static allocation

Stage: Net New to Alpha
Feature group: sig-network
Feature gate: ServiceNodePortStaticSubrange Default value: false

When using a NodePort-type Service to expose a Pod outside the cluster, you can either request a specific (static) port or let the cluster select a port for you.

However, if you choose a specific port and it’s already in use, you’ll get an error.

This enhancement reduces the chance of such collisions by splitting the node port range into two bands. Ports assigned dynamically are taken preferably from the upper band, so the ports in the lower band are more likely to stay free for static assignments. You can define the whole range with the --service-node-port-range argument in kube-apiserver. The default is:

--service-node-port-range 30000-32767

#1880 Multiple Service CIDRs

Stage: Net New to Alpha
Feature group: sig-network
Feature gate: MultiCIDRServiceAllocator Default value: false

This enhancement comprises the work done to refactor how internal IP addresses are assigned to Services. This will remove the current limits on IP ranges (service-cluster-ip-range). It will also make it possible to inspect the IP addresses assigned to your Services using kubectl:

$ kubectl get services
NAME         TYPE        CLUSTER-IP        EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   2001:db8:1:2::1   <none>        443/TCP   3d1h
$ kubectl get ipaddresses
NAME              PARENTREF
2001:db8:1:2::1   services/default/kubernetes
2001:db8:1:2::a   services/kube-system/kube-dns

This has been implemented by replacing the current etcd allocation map with the use of IPAddress objects that will be referenced from each Service object.

#3178 Cleaning up IPTables Chain Ownership

Stage: Graduating to Beta
Feature group: sig-network
Feature gate: IPTablesOwnershipCleanup Default value: true

With the removal of Dockershim, some extra cleanup is being done. In particular, kubelet and kube-proxy are being refactored so kubelet is no longer the one creating IPTables chains.

Read more in our “What’s new in Kubernetes 1.25” article.

#3453 Minimizing iptables-restore input size

Stage: Graduating to Beta
Feature group: sig-network
Feature gate: MinimizeIPTablesRestore Default value: true

This enhancement aims to improve the performance of kube-proxy. It will do so by only sending the rules that have changed on the calls to iptables-restore, instead of the whole set of rules.

Read more in our “What’s new in Kubernetes 1.26” article.

#3458 Remove transient node predicates from KCCM’s service controller

Stage: Net New to Beta
Feature group: sig-network
Feature gate: StableLoadBalancerNodeSet Default value: true

When the Kubernetes cloud controller manager (KCCM) removes Nodes from the “load balancers’ node set,” it terminates all the connections to the node instantly.

With this enhancement, the “load balancers’ node set” is not always updated, letting the cloud providers implement a more graceful shutdown on their load balancers.

Kubernetes 1.27 Nodes

#3673 Kubelet limit of Parallel Image Pulls

Stage: Net New to Alpha
Feature group: sig-node
Kubelet config option: serializeImagePulls Default value: true
Kubelet config option: maxParallelImagePulls Default value: 0

You can currently set serializeImagePulls to false in the kubelet configuration to allow container images to be downloaded in parallel. However, until now there was no way to limit how many images were downloaded at the same time.

This could cause spikes on network or disk usage that would affect the performance of a cluster.

Now, you can set maxParallelImagePulls to a number different from 0 in the kubelet configuration to limit the number of parallel image pulls.
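
In a kubelet configuration file, the two options could be combined like this (the limit of 5 is an arbitrary example value):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
serializeImagePulls: false                  # allow parallel image pulls
maxParallelImagePulls: 5                    # but never more than five at a time
```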

#1287 In-Place Update of Pod Resources

Stage: Net New to Alpha
Feature group: sig-node
Feature gate: InPlacePodVerticalScaling Default value: false

This feature allows changing container resource requests and limits without restarting the Pod.

The current fields in the pod spec, Pod.Spec.Containers[i].Resources, become a declaration of the desired state of pod resources. Meanwhile, new fields are added to the Pod.Status sub-resource:

  • .ContainerStatuses[i].AllocatedResources: Describes the resources allocated to the containers.
  • .ContainerStatuses[i].Resources: For the actual resources held by the containers.
  • .Resize: With details on resource resizing operations progress.

The container spec in the Pod can also specify a ResizePolicy for CPU and memory, with the restart policy values NotRequired and RestartContainer, detailing whether or not a restart of the container is needed to properly apply new values.
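
A minimal sketch of a resize policy follows. It assumes the InPlacePodVerticalScaling feature gate is enabled and uses the value names NotRequired and RestartContainer as, to our knowledge, they appear in the released 1.27 API (the KEP drafts used different names, so check the API reference of your version). The Pod name, image, and resource amounts are placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resize-demo                         # hypothetical name
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0   # placeholder image
      resizePolicy:
        - resourceName: cpu
          restartPolicy: NotRequired        # CPU can change without a restart
        - resourceName: memory
          restartPolicy: RestartContainer   # memory changes restart the container
      resources:
        requests:
          cpu: 500m
          memory: 256Mi
        limits:
          cpu: 500m
          memory: 256Mi
```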

#2570 Support Memory QoS with Cgroups v2

Stage: Alpha
Feature group: sig-node
Feature gate: MemoryQoS Default value: false

It’s now possible to use cgroups v2 to configure Memory QoS to kill and throttle processes depending on their memory usage.

Read more in our “What’s new in Kubernetes 1.22” article.

#3695 Extend PodResources to include resources from Dynamic Resource Allocation (DRA)

Stage: Net New to Alpha
Feature group: sig-node
Feature gate: KubeletPodResourcesGet Default value: false
Feature gate: KubeletPodResourcesDynamicResources Default value: false

With this enhancement, the new resources added on “#3063 dynamic resource allocation” are now exposed to the API by the kubelet.

#3063 Dynamic resource allocation

Stage: Alpha
Feature group: sig-node
Feature gate: DynamicResourceAllocation Default value: false

The scheduler can now keep track of resource claims like FPGAs or shared GPUs, and only schedule Pods in those nodes with enough resources available.

Read more in our “What’s new in Kubernetes 1.26” article.

#127 Support User Namespaces in stateless pods

Stage: Alpha
Feature group: sig-node
Feature gate: UserNamespacesSupport Default value: false

Bringing user namespaces to the Kubernetes ecosystem opens a new range of possibilities to improve security. For example, you can now let overly demanding containers believe they are running in privileged mode.
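
As a sketch, and assuming the UserNamespacesSupport feature gate is enabled and the container runtime supports it, a Pod opts into its own user namespace with the hostUsers field. The name and image are placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: userns-demo                         # hypothetical name
spec:
  hostUsers: false                          # run the Pod in its own user namespace
  containers:
    - name: app
      image: registry.example.com/app:1.0   # placeholder image
```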

Read more in our “What’s new in Kubernetes 1.25” article.

#2053 Add downward API support for hugepages

Stage: Graduating to Stable
Feature group: sig-node
Feature gate: DownwardAPIHugePages Default value: true

Pods are now able to fetch information on their hugepage requests and limits via the downward API. This keeps things consistent with other resources like cpu, memory, and ephemeral-storage.
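
A hedged example of what this could look like: the container below exposes its 2Mi hugepages limit as an environment variable. The Pod name, image, variable name, and sizes are illustrative, and the node needs pre-allocated hugepages for the Pod to be scheduled.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hugepages-demo                      # hypothetical name
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0   # placeholder image
      resources:
        requests:
          memory: 128Mi
          hugepages-2Mi: 64Mi
        limits:
          memory: 128Mi
          hugepages-2Mi: 64Mi               # hugepages requests must equal limits
      env:
        - name: HUGEPAGES_2MI_LIMIT         # hypothetical variable name
          valueFrom:
            resourceFieldRef:
              containerName: app
              resource: limits.hugepages-2Mi
```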

Read more in our “What’s new in Kubernetes 1.20” article.

#2413 Kubelet option to enable seccomp by default

Stage: Graduating to Stable
Feature group: sig-node
Feature gate: SeccompDefault Default value: true

Kubernetes now increases the security of your containers, executing them using a Seccomp profile by default.

Read more in our “What’s new in Kubernetes 1.22” article.

#693 Node Topology Manager

Stage: Graduating to Stable
Feature group: sig-node
Feature gate: TopologyManager Default value: true

Machine learning, scientific computing, and financial services are examples of systems that are computationally intensive and require ultra-low latency. These kinds of workloads benefit from processes being pinned to one CPU core rather than jumping between cores or sharing time with other processes.

The node topology manager is a kubelet component that centralizes the coordination of hardware resource assignments. Without it, this task is divided between several components (CPU manager, device manager, and CNI), which sometimes results in unoptimized allocations.

Read more on the release for 1.16 in the What’s new in Kubernetes series.

#2727 Add gRPC probe to Pod.Spec.Container.{Liveness,Readiness,Startup}Probe

Stage: Graduating to Stable
Feature group: sig-node
Feature gate: GRPCContainerProbe Default value: true

This enhancement allows configuring gRPC (HTTP/2) liveness, readiness, and startup probes on Pods.

The liveness probes added in Kubernetes 1.16 allow you to periodically check if an application is still alive.

In Kubernetes 1.23, support for the gRPC protocol was added.
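
A minimal sketch of a gRPC liveness probe; the Pod name, image, port, and timings are placeholders, and the application is assumed to implement the standard gRPC health checking protocol.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: grpc-probe-demo                     # hypothetical name
spec:
  containers:
    - name: server
      image: registry.example.com/grpc-server:1.0   # placeholder image
      ports:
        - containerPort: 9090
      livenessProbe:
        grpc:
          port: 9090                        # port serving the gRPC health service
        initialDelaySeconds: 10
        periodSeconds: 5
```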

#2238 Add configurable grace period to probes

Stage: Graduating to Stable
Feature group: sig-node
Feature gate: ProbeTerminationGracePeriod Default value: true

This enhancement introduces a second terminationGracePeriodSeconds field, inside the livenessProbe object, to differentiate two situations: how long Kubernetes should wait before killing a container under regular circumstances, and how long it should wait when the kill is due to a failed livenessProbe.
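
For instance, the Pod below (placeholder names, image, and endpoint) gets a long grace period for regular shutdowns and a much shorter one when it is killed after failing its liveness probe:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: grace-period-demo                   # hypothetical name
spec:
  terminationGracePeriodSeconds: 3600       # regular shutdown: up to one hour
  containers:
    - name: app
      image: registry.example.com/app:1.0   # placeholder image
      livenessProbe:
        httpGet:
          path: /healthz                    # hypothetical health endpoint
          port: 8080
        failureThreshold: 3
        periodSeconds: 10
        terminationGracePeriodSeconds: 60   # failed probe: only one minute
```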

Read more in our “Kubernetes 1.21 – What’s new?” article.

#3386 Kubelet Evented PLEG for Better Performance

Stage: Graduating to Beta
Feature group: sig-node
Feature gate: EventedPLEG Default value: false

This enhancement reduces the CPU usage of the kubelet when keeping track of all the pod states. It does so by relying on notifications from the Container Runtime Interface (CRI) as much as possible.

Read more in our “What’s new in Kubernetes 1.26” article.

Scheduling in Kubernetes 1.27

#3838 Mutable Pod scheduling directives when gated

Stage: Graduating to Beta
Feature group: sig-scheduling
Feature gate: PodSchedulingReadiness Default value: true

Since “#3521 Pod scheduling readiness,” Pods can define when they are ready to be scheduled.

This feature makes the scheduling directives of a Pod mutable while it is held by scheduling gates, so other components, like custom scheduling controllers, can use them to expand the features of kube-scheduler. For example, an admission controller can add one of these scheduling gates to a Pod, and then remove it once there are resources available.

Some restrictions apply to this mutability:

  • .spec.nodeSelector: Only accepts additions.
  • .spec.affinity.nodeAffinity: If it’s nil, it can be set to anything.
  • NodeSelectorTerms: Can be set if empty. If not, it only accepts additions of new requirements to the existing terms.
  • .preferredDuringSchedulingIgnoredDuringExecution: Accepts all updates.
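To make the first restriction concrete, here is a minimal sketch of an “additions only” check for .spec.nodeSelector. It is a simplified model of the rule, not the API server’s validation code, and the function name and labels are hypothetical.

```python
def node_selector_update_allowed(old, new):
    """Allow an update only if every existing key/value pair is preserved.

    New keys may be added, but existing entries cannot be removed or changed,
    so the update can only narrow down the set of eligible nodes.
    """
    return all(new.get(key) == value for key, value in old.items())


current = {"disktype": "ssd"}

# Adding a key is fine.
print(node_selector_update_allowed(current, {"disktype": "ssd", "zone": "eu-1"}))
# Changing or removing an existing key is rejected.
print(node_selector_update_allowed(current, {"disktype": "hdd"}))
print(node_selector_update_allowed(current, {"zone": "eu-1"}))
```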

#3521 Pod Scheduling Readiness

Stage: Graduating to Beta
Feature group: sig-scheduling
Feature gate: PodSchedulingReadiness Default value: true

This enhancement aims to optimize scheduling by letting the Pods define when they are ready to be actually scheduled.
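As a sketch of the idea, the snippet below models a Pod spec as a Python dict with a schedulingGates list. The gate name is hypothetical; the point is that the scheduler only considers the Pod once the list is empty.

```python
# A Pod held back by one scheduling gate. The gate name is a made-up example.
pod = {
    "metadata": {"name": "gated-pod"},
    "spec": {
        "schedulingGates": [{"name": "example.com/quota-check"}],
        "containers": [{"name": "app", "image": "example.com/app:1.0"}],
    },
}


def is_gated(pod):
    """A Pod is not considered for scheduling while it still has gates."""
    return len(pod["spec"].get("schedulingGates", [])) > 0


def remove_gate(pod, gate_name):
    """Drop one gate, as an external controller would once its check passes."""
    gates = pod["spec"].get("schedulingGates", [])
    pod["spec"]["schedulingGates"] = [g for g in gates if g["name"] != gate_name]


print(is_gated(pod))
remove_gate(pod, "example.com/quota-check")
print(is_gated(pod))
```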

Read more in our “What’s new in Kubernetes 1.26” article.

#3243 Respect PodTopologySpread after rolling upgrades

Stage: Graduating to Beta
Feature group: sig-scheduling
Feature gate: MatchLabelKeysInPodTopologySpread Default value: true

PodTopologySpread gives better control over how evenly related Pods are distributed. However, when rolling out a new set of Pods, the existing (soon to disappear) Pods are included in the calculations, which might lead to an uneven distribution of the future ones.

This enhancement adds a new field to the Pod spec, matchLabelKeys, and updates the calculation algorithm to take it into account. With it, arbitrary labels included in the Pod’s definition can be considered, enabling the scheduler to build more precise sets of Pods before calculating the spread.
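The effect on the spread calculation can be sketched as follows. This is a simplified model under stated assumptions: the real scheduler also applies the constraint’s label selector, the helper names and the zone field are hypothetical, and pod-template-hash is used as the key because Deployments set it per revision.

```python
from collections import Counter


def pods_per_domain(pods, incoming_labels, match_label_keys):
    """Count existing Pods per zone, keeping only those whose labels match
    the incoming Pod's values for the given keys."""
    wanted = {k: incoming_labels[k] for k in match_label_keys if k in incoming_labels}
    counts = Counter()
    for pod in pods:
        if all(pod["labels"].get(k) == v for k, v in wanted.items()):
            counts[pod["zone"]] += 1
    return dict(counts)


def skew(counts, domains):
    """Difference between the most and least populated topology domains."""
    per_domain = [counts.get(d, 0) for d in domains]
    return max(per_domain) - min(per_domain)


# Mid-rollout: three Pods from the old revision, one from the new one.
pods = [
    {"zone": "zone-a", "labels": {"app": "web", "pod-template-hash": "v1"}},
    {"zone": "zone-a", "labels": {"app": "web", "pod-template-hash": "v1"}},
    {"zone": "zone-a", "labels": {"app": "web", "pod-template-hash": "v1"}},
    {"zone": "zone-b", "labels": {"app": "web", "pod-template-hash": "v2"}},
]
incoming = {"app": "web", "pod-template-hash": "v2"}
zones = ["zone-a", "zone-b"]

# Without the extra keys, old Pods distort the picture.
print(skew(pods_per_domain(pods, incoming, []), zones))
# With them, only Pods from the same revision are counted.
print(skew(pods_per_domain(pods, incoming, ["pod-template-hash"]), zones))
```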

Read more in our “What’s new in Kubernetes 1.25” article.

#2926 Mutable scheduling directives for suspended Jobs

Stage: Graduating to Stable
Feature group: sig-scheduling
Feature gate: JobMutableNodeSchedulingDirectives Default value: true

Since Kubernetes 1.23, it is possible to update the node affinity, node selector, tolerations, labels, and annotations fields in a Job’s pod template before it starts. That way, you can influence where the Pods will run, like all in the same zone, or in nodes with the same GPU model.

Read more in our “What’s new in Kubernetes 1.23” article.

Kubernetes 1.27 storage

#3476 VolumeGroupSnapshot

Stage: Graduating to Alpha
Feature group: sig-storage
Flag: enable-volume-group-snapshot

This enhancement adds support in the Kubernetes API to create a snapshot of multiple volumes together, so that the snapshots are consistent with each other and data loss is prevented.

Take the example of an application that uses several volumes, such as one for data and another for logs. If you create the snapshots of both volumes at different times, the application may not behave consistently after a disaster recovery.

CSI drivers must support the CREATE_DELETE_GET_VOLUME_GROUP_SNAPSHOT capability in order to perform a VolumeGroupSnapshot.
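As we understand the KEP, the volumes that belong to a group are chosen with a label selector over PersistentVolumeClaims. The sketch below models only that selection step in Python; the PVC names, labels, and the select_group helper are hypothetical, and taking the actual snapshots is left to the CSI driver.

```python
def select_group(pvcs, match_labels):
    """Return the names of the PVCs whose labels contain every selector pair."""
    return sorted(
        pvc["name"]
        for pvc in pvcs
        if all(pvc.get("labels", {}).get(k) == v for k, v in match_labels.items())
    )


pvcs = [
    {"name": "app-data", "labels": {"group": "my-app"}},
    {"name": "app-logs", "labels": {"group": "my-app"}},
    {"name": "other-data", "labels": {"group": "other"}},
]

# Both volumes of the application end up in the same group snapshot.
print(select_group(pvcs, {"group": "my-app"}))
```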

#3756 Robust VolumeManager reconstruction after kubelet restart

Stage: Net New to Beta
Feature group: sig-storage
Feature gate: NewVolumeManagerReconstruction Default value: true

This enhancement is the part of “#1710 SELinux relabeling using mount options” that reduces the time kubelet takes to re-track mounted volumes after it’s restarted.

After the kubelet is restarted, it no longer knows what volumes it mounted for the running Pods. It can restore this state by cross-referencing information from the API server with data from the host’s OS. It checks what Pods should be running, and what volumes are actually mounted.
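Conceptually, this cross-check is a comparison between two sets. The following is a toy model, not the kubelet’s implementation: the reconcile function and the (Pod, volume) pairs are hypothetical.

```python
def reconcile(desired, mounted):
    """Compare what should be mounted (from the API server) with what is
    actually mounted (from the host OS)."""
    desired, mounted = set(desired), set(mounted)
    return {
        "keep": sorted(desired & mounted),      # mounted and still needed
        "unmount": sorted(mounted - desired),   # leftovers to clean up
        "mount": sorted(desired - mounted),     # needed but not mounted yet
    }


desired = [("web", "data"), ("web", "logs")]
mounted = [("web", "data"), ("old-job", "scratch")]

print(reconcile(desired, mounted))
```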

However, this process is not perfect, as it sometimes fails to clean up some volumes. It’s also missing some key information, like what mount options the previous kubelet used to mount the volumes.

So, in the end, this enhancement is mostly a bug fix: “Just make it work as expected.” It involves a considerable refactor, and the behavior will change from the current one. That’s why it’s split into a separate enhancement and has its own feature gate, so admins can roll back to the previous behavior.

#2485 ReadWriteOncePod PersistentVolume Access Mode

Stage: Graduating to Beta
Feature group: sig-storage
Feature gate: ReadWriteOncePod Default value: true

With this enhancement, it’s possible to access PersistentVolumes in a ReadWriteOncePod mode, restricting access to a single Pod on a single node.
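The restriction can be pictured with a very small check. This is a simplified sketch of the rule, not the scheduler’s code, and the function name and Pod names are hypothetical.

```python
def can_use_claim(access_modes, pods_using_claim):
    """With ReadWriteOncePod, a claim can be used by at most one Pod at a time."""
    if "ReadWriteOncePod" in access_modes:
        return len(pods_using_claim) == 0
    # Other access modes do not limit usage to a single Pod.
    return True


print(can_use_claim(["ReadWriteOncePod"], []))
print(can_use_claim(["ReadWriteOncePod"], ["writer-0"]))
print(can_use_claim(["ReadWriteOnce"], ["writer-0"]))
```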

Read more in our “What’s new in Kubernetes 1.22” article.

#3107 Introduce nodeExpandSecret in CSI PV source

Stage: Graduating to Beta
Feature group: sig-storage
Feature gate: CSINodeExpandSecret Default value: true

This enhancement enables passing the secretRef field to the CSI driver when doing a NodeExpandVolume operation.

Read more in our “What’s new in Kubernetes 1.25” article.

#3141 Prevent unauthorized volume mode conversion during volume restore

Stage: Graduating to Beta
Feature group: sig-storage

This feature adds a layer of security to the VolumeSnapshot feature, which GA’d in Kubernetes 1.20. It prevents the unauthorized conversion of the volume mode during a restore operation. Although there isn’t a known CVE in the kernel that would allow a malicious user to exploit it, this feature aims to remove that possibility just in case.

In this version, the annotation snapshot.storage.kubernetes.io/allowVolumeModeChange has changed to snapshot.storage.kubernetes.io/allow-volume-mode-change.
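A simplified model of the check, using the renamed annotation, could look like this. It is a sketch under assumptions rather than the actual provisioner logic: the restore_allowed helper is hypothetical, and we assume the source volume mode is recorded in the sourceVolumeMode field of the VolumeSnapshotContent.

```python
ALLOW_ANNOTATION = "snapshot.storage.kubernetes.io/allow-volume-mode-change"


def restore_allowed(snapshot_content, requested_mode):
    """Allow a restore if the volume mode is unchanged, or if the
    VolumeSnapshotContent carries the annotation that authorizes the change."""
    source_mode = snapshot_content.get("spec", {}).get("sourceVolumeMode")
    if source_mode is None or source_mode == requested_mode:
        return True
    annotations = snapshot_content.get("metadata", {}).get("annotations", {})
    return annotations.get(ALLOW_ANNOTATION) == "true"


plain = {"metadata": {}, "spec": {"sourceVolumeMode": "Filesystem"}}
authorized = {
    "metadata": {"annotations": {ALLOW_ANNOTATION: "true"}},
    "spec": {"sourceVolumeMode": "Filesystem"},
}

print(restore_allowed(plain, "Filesystem"))
print(restore_allowed(plain, "Block"))
print(restore_allowed(authorized, "Block"))
```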

Read more in our “What’s new in Kubernetes 1.24” article.

#1710 Speed up recursive SELinux label change

Stage: Graduating to Beta
Feature group: sig-storage
Feature gate: SELinuxMountReadWriteOncePod Default value: true

This feature is meant to speed up the mounting of PersistentVolumes using SELinux. By using the context option at mount time, Kubernetes will apply the security context on the whole volume instead of changing context on the files recursively.

Read more in our “What’s new in Kubernetes 1.25” article.

Other enhancements in Kubernetes 1.27

#2699 KEP for adding webhook hosting capability to the CCM framework

Stage: Net New to Alpha
Feature group: sig-cloud-provider
Feature gate: CloudWebhookServer Default value: false

This enhancement comprises the work to enable cloud providers to replace the PersistentVolumeLabel (PVL) admission controller with a webhook.

The Cloud Controller Manager (CCM) is the binary that cloud providers tweak to make a Kubernetes cluster work correctly on their cloud.

Some operations, like the ones performed by the PVL admission controller, are time sensitive and may require a special setup to ensure performance. In cases like these, some cloud providers would prefer to decouple this code into webhooks.

#2258 Node log query

Stage: Net New to Alpha
Feature group: sig-windows
Feature gate: NodeLogQuery Default value: false

This enhancement adds a kubelet API for viewing the logs of services running on nodes.

For example, to fetch the kubelet logs from a node, you can run:

kubectl get --raw "/api/v1/nodes/node-1/proxy/logs/?query=kubelet"

#1731 Publishing Kubernetes packages on community infrastructure

Stage: Net New to Alpha
Feature group: sig-release

This enhancement details the ongoing changes to the release process and the package build infrastructure. The end goal is to move from the Google-owned package repository to the community-managed Open Build Service (OBS) repository.

#3744 Stay on supported go versions

Stage: Graduating to Stable
Feature group: sig-release

Kubernetes is written in Go. Within minor versions, Kubernetes tries not to update its version of Go to avoid introducing breaking changes. However, this means that by the end of its lifecycle, a Kubernetes release may be behind on the security patches implemented in Go.

This enhancement sets up the processes and guidelines that will allow the Kubernetes developers to stay up to date with the latest versions of Go without introducing breaking changes to us, the users.

If you are interested in the details of these guidelines, head over to the KEP.

#3203 Auto-refreshing Official CVE Feed

Stage: Graduating to Beta
Feature group: sig-security

You can now programmatically fetch a feed of CVEs with relevant information about Kubernetes. This is not an enhancement that you’ll enable in your cluster, but one you’ll consume via web resources. You only need to look for the label official-cve-feed among the vulnerability announcements.

Read more in our “What’s new in Kubernetes 1.25” article.

#1610 Container Resource-based Pod Autoscaling

Stage: Graduating to Beta
Feature group: sig-autoscaling
Feature gate: HPAContainerMetrics Default value: true

The current Horizontal Pod Autoscaler (HPA) can scale workloads based on the resources used by their Pods. This is the aggregated usage from all of the containers in a Pod.

This feature allows the HPA to scale those workloads based on the resource usage of individual containers.
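The difference matters when one container dominates the load. The sketch below applies the documented HPA formula, desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), to both cases. The container names and numbers are hypothetical, and the model ignores details such as the tolerance window and stabilization.

```python
import math


def desired_replicas(current_replicas, usage_milli, request_milli, target_percent):
    """HPA formula with utilization expressed as usage / request, in percent."""
    return math.ceil(
        current_replicas * usage_milli * 100 / (request_milli * target_percent)
    )


# CPU usage and requests in millicores for each container of a Pod.
containers = {
    "app": {"usage": 450, "request": 500},      # 90% utilization
    "sidecar": {"usage": 50, "request": 500},   # 10% utilization
}
replicas, target = 4, 60

# Pod-level metric: the idle sidecar hides the pressure on the app container.
total_usage = sum(c["usage"] for c in containers.values())
total_request = sum(c["request"] for c in containers.values())
print(desired_replicas(replicas, total_usage, total_request, target))

# Container-level metric: scale on the app container alone.
app = containers["app"]
print(desired_replicas(replicas, app["usage"], app["request"], target))
```

In this example, the Pod-level calculation keeps the workload at four replicas, while the container-level one scales it out to six.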

Read more in our “What’s new in Kubernetes 1.20” article.


That’s all for Kubernetes 1.27, folks! Exciting as always. Get ready to upgrade your clusters if you are intending to use any of these features.

And if you enjoy keeping up to date with the Kubernetes ecosystem, subscribe to our container newsletter, a monthly email with the coolest stuff happening in the cloud-native ecosystem.