Many production Kubernetes clusters blend on-demand (higher-SLA) and spot/preemptible (lower-SLA) nodes to optimize costs while maintaining reliability for critical workloads. Platform teams need a safe default that keeps most workloads away from risky capacity, while allowing specific workloads to opt in with explicit thresholds like "I can tolerate nodes with failure probability up to 5%".

Today, Kubernetes taints and tolerations can match exact values or check for existence, but they can’t compare numeric thresholds. You’d need to create discrete taint categories, use external admission controllers, or accept less-than-optimal placement decisions.

In Kubernetes v1.35, we’re introducing Extended Toleration Operators as an alpha feature. This enhancement adds Gt (Greater Than) and Lt (Less Than) operators to spec.tolerations, enabling threshold-based scheduling decisions that unlock new possibilities for SLA-based placement, cost optimization, and performance-aware workload distribution.

The evolution of tolerations

Historically, Kubernetes supported two primary toleration operators:

- Equal: The toleration matches a taint if the key and value are exactly equal
- Exists: The toleration matches a taint if the key exists, regardless of value

While these worked well for categorical scenarios, they fell short for numeric comparisons. Starting with v1.35, we are closing this gap.

Consider these real-world scenarios:

- SLA requirements: Schedule high-availability workloads only on nodes with failure probability below a certain threshold
- Cost optimization: Allow cost-sensitive batch jobs to run on cheaper nodes that exceed a specific cost-per-hour value
- Performance guarantees: Ensure latency-sensitive applications run only on nodes with disk IOPS or network bandwidth above minimum thresholds

Without numeric comparison operators, cluster operators…
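
To make the difference between the categorical operators (Equal, Exists) and the new numeric ones (Gt, Lt) concrete, here is a minimal Python sketch of toleration matching. It is an illustration, not the scheduler's implementation. Several things in it are assumptions rather than statements from this post: the Taint and Toleration classes and the tolerates function are hypothetical names; values are assumed to be compared as integers; Gt is assumed to match when the taint's value is greater than the toleration's value, and Lt when it is less; and the failure-probability key is made up for the example. Check the v1.35 API reference for the authoritative semantics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    # Hypothetical stand-in for a node taint.
    key: str
    value: str
    effect: str


@dataclass(frozen=True)
class Toleration:
    # Hypothetical stand-in for an entry in spec.tolerations.
    key: str
    operator: str      # "Equal", "Exists", "Gt", or "Lt"
    value: str = ""
    effect: str = ""   # empty means "any effect" in this sketch


def tolerates(tol: Toleration, taint: Taint) -> bool:
    """Return True if the toleration matches the taint (simplified)."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.key != taint.key:
        return False
    if tol.operator == "Exists":
        return True
    if tol.operator == "Equal":
        return tol.value == taint.value
    if tol.operator in ("Gt", "Lt"):
        # Assumption: both values must parse as integers; otherwise no match.
        try:
            taint_num = int(taint.value)
            tol_num = int(tol.value)
        except ValueError:
            return False
        # Assumption: the comparison reads "taint value <op> toleration value".
        if tol.operator == "Gt":
            return taint_num > tol_num
        return taint_num < tol_num
    return False


# A node advertising a 15% failure probability via a hypothetical taint key.
spot_node = Taint(key="failure-probability", value="15", effect="NoExecute")

# Under the assumed semantics, this workload accepts only nodes below 5%.
strict = Toleration(key="failure-probability", operator="Lt", value="5")

# This one accepts anything below 20%.
relaxed = Toleration(key="failure-probability", operator="Lt", value="20")

strict_ok = tolerates(strict, spot_node)
relaxed_ok = tolerates(relaxed, spot_node)
```

Under these assumptions the strict toleration does not match the 15% node while the relaxed one does, which is the kind of threshold that Equal and Exists alone cannot express without a separate taint value per category.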