Kubernetes is designed to orchestrate workloads efficiently across nodes, ensuring optimal resource utilization and workload reliability. However, when resource constraints arise, Kubernetes must make tough decisions—this is where pod eviction comes in. Understanding how Kubernetes evicts pods helps administrators optimize workload resilience and ensure high availability.
In this article, we will explore Kubernetes pod eviction mechanisms, diving into node-pressure eviction, API-driven eviction, pod priorities, pod preemption, and Quality of Service (QoS) classes. We will also examine how these factors interact to maintain cluster stability.
The Foundation: Quality of Service (QoS)
At the heart of Kubernetes' eviction decisions lies the Quality of Service (QoS) classification system. Every pod in Kubernetes is assigned one of three QoS classes:
- Guaranteed: Every container in the pod sets both CPU and memory requests and limits, and each request equals its limit.
- Burstable: The middle class. The pod does not meet the Guaranteed criteria, but at least one of its containers sets a CPU or memory request or limit.
- BestEffort: No container in the pod sets any CPU or memory request or limit.
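As a rough illustration of the three classes, the pod specs below show one way each class can arise. The pod names, the image, and the resource values are assumptions picked for the example, not recommendations.

```yaml
# Guaranteed: the container sets CPU and memory requests equal to its limits.
apiVersion: v1
kind: Pod
metadata:
  name: qos-guaranteed        # hypothetical name
spec:
  containers:
    - name: app
      image: nginx            # placeholder image
      resources:
        requests:
          cpu: "500m"
          memory: "256Mi"
        limits:
          cpu: "500m"
          memory: "256Mi"
---
# Burstable: a request is set, but the Guaranteed criteria are not met.
apiVersion: v1
kind: Pod
metadata:
  name: qos-burstable         # hypothetical name
spec:
  containers:
    - name: app
      image: nginx
      resources:
        requests:
          memory: "128Mi"
---
# BestEffort: no requests or limits on any container.
apiVersion: v1
kind: Pod
metadata:
  name: qos-besteffort        # hypothetical name
spec:
  containers:
    - name: app
      image: nginx
```

Kubernetes records the assigned class in the pod's status, so it can be checked with something like `kubectl get pod qos-guaranteed -o jsonpath='{.status.qosClass}'`.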
Evictions can occur for multiple reasons, including resource constraints (e.g., memory pressure) and administrative actions (e.g., API-initiated evictions). Kubernetes provides structured mechanisms to handle evictions gracefully. There are two main categories of pod eviction:
- Node-pressure eviction: Triggered automatically by the kubelet when a node experiences resource shortages.
- API-driven eviction: Initiated by a user or an external controller via the Kubernetes API.
Let’s break down these eviction types and the factors influencing them.
Node-Pressure Eviction: Automatic Resource Management
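When a node in the cluster experiences resource pressure, such as low available memory or disk space, Kubernetes triggers node-pressure eviction. This is a self-defense mechanism to prevent the node from becoming unresponsive or crashing. The kubelet monitors resource usage and, once a configured threshold is crossed, ranks pods for eviction by whether their usage exceeds their requests, then by pod priority, then by how far their usage exceeds their requests. The QoS class is not consulted directly, but because it derives from requests and limits, it is a good predictor of the resulting order.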
Configuring Node-Pressure Eviction
Node-pressure eviction is configured by setting eviction thresholds in the kubelet configuration file. The following example sets hard and soft thresholds along with their grace periods:
```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionHard:
  memory.available: "500Mi"
  nodefs.available: "10%"
  nodefs.inodesFree: "5%"
  imagefs.available: "15%"
evictionSoft:
  memory.available: "1Gi"
  nodefs.available: "15%"
  nodefs.inodesFree: "10%"
  imagefs.available: "20%"
evictionSoftGracePeriod:
  memory.available: "1m30s"
  nodefs.available: "1m30s"
  nodefs.inodesFree: "1m30s"
  imagefs.available: "1m30s"
evictionMaxPodGracePeriod: 600
evictionPressureTransitionPeriod: "5m0s"
```
Key configuration components:
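- Hard Eviction Thresholds (evictionHard): When one of these is crossed, the kubelet starts evicting pods immediately, with no grace period.
- Soft Eviction Thresholds (evictionSoft): Pods are evicted only if the threshold stays crossed for the full grace period configured for that signal.
- Grace Periods: evictionSoftGracePeriod sets how long a soft threshold must stay crossed before eviction begins. evictionMaxPodGracePeriod caps, in seconds, the termination grace period granted to pods evicted under a soft threshold.
- Pressure Transition Period (evictionPressureTransitionPeriod): How long the kubelet waits before clearing a node pressure condition, which keeps the condition from flapping when usage hovers around a threshold.
The keys under each threshold (memory.available, nodefs.available, nodefs.inodesFree, imagefs.available) are eviction signals. The kubelet compares the current value of each signal against its configured threshold to decide when to start evicting.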
Factors Affecting Node-Pressure Eviction
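- Resource Usage and Quality of Service (QoS) Class: The kubelet first targets pods whose usage of the starved resource exceeds their requests. It does not read the QoS class directly, but the class predicts the likely order:
  - Guaranteed: Typically evicted last, and only when the node is still under pressure after other candidates are gone.
  - Burstable: Evicted when usage exceeds requests, typically after BestEffort pods but before Guaranteed pods.
  - BestEffort: With no requests set, any usage exceeds them, so these pods are typically evicted first.
- Pod Priority: Among pods that exceed their requests, lower-priority pods are evicted before higher-priority ones. Pod priority is defined by creating a PriorityClass and referencing it in the pod specification (`priorityClassName`).
- Graceful Termination: For soft thresholds, the kubelet lets an evicted pod terminate gracefully, using the lesser of the pod's `terminationGracePeriodSeconds` and evictionMaxPodGracePeriod. For hard thresholds, pods are terminated immediately with no grace period.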
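To make the priority and grace-period settings concrete, here is a minimal sketch of a PriorityClass and a pod that references it. The class name, priority value, pod name, image, and resource figures are all assumptions chosen for illustration.

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: business-critical          # hypothetical class name
value: 100000                      # higher value means higher priority
globalDefault: false
description: "Illustrative class for workloads that should rank late in eviction order."
---
apiVersion: v1
kind: Pod
metadata:
  name: payments-api               # hypothetical pod name
spec:
  priorityClassName: business-critical
  terminationGracePeriodSeconds: 30   # honored on soft-threshold evictions
  containers:
    - name: app
      image: nginx                 # placeholder image
      resources:
        requests:                  # staying within requests also lowers eviction risk
          cpu: "250m"
          memory: "256Mi"
```

A high priority alone does not make a pod immune: a pod that runs well above its requests can still be picked before a lower-priority pod that stays within them.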
API-Driven Eviction: The Manual Override
Unlike node-pressure evictions, which are automatic, API-driven evictions occur when users or controllers explicitly request pod removal using the Eviction API.
Use Cases for API-Driven Eviction
- Cluster Autoscaler: Scales down nodes by evicting pods before removing the node.
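- Node Maintenance: kubectl drain uses the Eviction API to empty a node before an upgrade or repair. Workload controllers such as Deployment and ReplicaSet, by contrast, delete pods directly during rolling updates rather than going through the Eviction API.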
- Administrative Actions: Operators can manually evict pods to redistribute workloads.
API-driven evictions respect pod disruption budgets (PDBs), ensuring that evictions do not impact availability beyond acceptable thresholds.
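The interaction between the Eviction API and a PDB can be sketched with two objects. The names, labels, namespace, and the minAvailable value below are hypothetical. Note that the Eviction object is not applied like a normal manifest; it is the body of a POST to the pod's eviction subresource.

```yaml
# A budget that keeps at least two pods labeled app=web running.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb                    # hypothetical name
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: web                     # hypothetical label
---
# Body of an eviction request for one of those pods, POSTed to:
#   /api/v1/namespaces/default/pods/web-0/eviction
apiVersion: policy/v1
kind: Eviction
metadata:
  name: web-0                      # pod to evict (hypothetical)
  namespace: default
```

If granting the eviction would drop the matching pods below the budget, the API server rejects the request with 429 Too Many Requests, and the caller is expected to retry later.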
Pod Priority and Preemption: The Hierarchy of Importance
Not all pods are created equal. Kubernetes lets users assign priorities to pods, and those priorities influence both eviction and scheduling decisions. Preemption is the scheduling side of this: when a pending pod cannot fit on any node, the scheduler evaluates the nodes and checks whether removing lower-priority pods from one of them would free enough resources for the pending pod.
- A higher-priority pod preempts lower-priority ones when no node has sufficient free resources.
- Pods with the same or higher priority are never preempted; the scheduler only considers lower-priority victims.
- Preemption tries to respect pod disruption budgets (PDBs), but only on a best-effort basis: if no other victims can be found, pods may be preempted even when that violates a PDB.
This makes it far more likely that critical workloads get the resources they need, provided there are lower-priority pods to preempt.
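Preemption can also be turned off for a given class. As a small sketch, assuming a hypothetical class name and value, the PriorityClass below gives its pods a high place in the scheduling queue without letting them preempt running pods:

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority-nonpreempting   # hypothetical class name
value: 50000                          # illustrative priority value
preemptionPolicy: Never               # default is PreemptLowerPriority
globalDefault: false
description: "Illustrative class: scheduled ahead of lower priorities, but never preempts."
```

Pods using such a class wait for capacity to free up on its own, which may suit batch work that is important but not urgent.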
Conclusion
Pod eviction in Kubernetes is a finely tuned process that balances resource availability, workload importance, and user-defined policies. By leveraging node-pressure eviction, API-driven eviction, pod priorities, pod preemption, and QoS classes, Kubernetes ensures that clusters remain stable and efficient, even under pressure. By understanding these mechanisms, you can optimize your workloads, ensure fair resource distribution, and prevent disruptions in your Kubernetes environments.
So, the next time a pod is evicted, remember: it’s not a failure—it’s Kubernetes doing its job, mastering the art of letting go.