Rodrigo Estrada

Beyond Kubernetes: Why Some Applications Are Better Off Without It

Kubernetes (k8s) has become the gold standard for container orchestration, celebrated for its ability to manage modern microservices architectures with agility and resilience. However, when applied to applications that aren’t cloud-native, the value proposition becomes less clear. In some cases, Kubernetes can introduce unnecessary complexity and cost, especially when simpler solutions like virtual machines (VMs) or alternative orchestrators might be more effective. This article explores when Kubernetes makes sense, when it doesn’t, and why tools like Nomad may sometimes be a better fit.

Kubernetes and the Challenge of “Non-Native” Applications
Elasticity vs. Scalability: The Core Mismatch
Kubernetes thrives in scenarios requiring elasticity — the ability to dynamically adjust resources based on demand. Some applications, however, even if they are modern, are not designed to fully leverage Kubernetes’ features. These non-cloud-native applications may rely on scalability models that involve vertical scaling (adding more CPU or memory to a single instance) rather than horizontal scaling (adding more instances).

While elasticity can be mathematically framed as the system’s ability to converge resource usage with demand in real time, some applications cannot adapt to this model. For example:

  • A horizontally scalable Kubernetes application distributes load across multiple lightweight instances.
  • A vertically scalable application demands increased resources within a single instance, violating Kubernetes’ design principles.

This mismatch often leads to inefficiencies, higher costs, and operational challenges when deploying such apps on Kubernetes.
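
To make the contrast concrete, here is a minimal Python sketch of the two scaling models (the function names and throughput numbers are hypothetical, not from any real autoscaler): horizontal scaling meets demand by adding instances, vertical scaling by growing a single one.

```python
import math

def horizontal_replicas(demand_rps: float, per_instance_rps: float) -> int:
    """Horizontal model: cover demand by adding lightweight instances."""
    return max(1, math.ceil(demand_rps / per_instance_rps))

def vertical_cpus(demand_rps: float, rps_per_cpu: float, min_cpus: float = 1.0) -> float:
    """Vertical model: cover demand by growing one instance's CPU."""
    return max(min_cpus, demand_rps / rps_per_cpu)

for demand in (100, 500, 2000):
    print(f"demand={demand} rps -> "
          f"{horizontal_replicas(demand, 250)} replicas (horizontal) or "
          f"{vertical_cpus(demand, 250):.1f} CPUs (vertical)")
```

Kubernetes’ scheduler and autoscalers assume the first shape; the second is ultimately capped by the size of the largest available node.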

Requests vs. Limits: Balancing Act

Kubernetes schedules workloads based on requests (minimum guaranteed resources) but enforces limits (maximum allowable resources). This distinction becomes critical:

  • If an application exceeds its CPU limit, performance degrades (due to throttling).
  • If it exceeds its memory limit, Kubernetes terminates the pod (OOMKilled).

Non-cloud-native applications, even stateless ones, often have unpredictable spikes in resource usage, making it difficult to find a balance:

  • Setting high limits leads to resource overcommitment, increasing costs.
  • Setting low limits risks frequent crashes during usage peaks.
The dilemma can be modeled as:

Difference = Limit - Average Usage

If this difference is too small, the application crashes during peaks. If it’s too large, resources are wasted. Managing this trade-off is especially challenging in workloads with irregular demand.
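
As a rough illustration, here is a toy Python simulation (the workload parameters are invented, and CPU throttling is reduced to a simple threshold check) of how a fixed limit trades crash risk against wasted headroom:

```python
import random

random.seed(42)

def simulate(limit_mcpu: float, n_samples: int = 10_000) -> tuple[float, float]:
    """Return (fraction of samples over the limit, average unused headroom)."""
    over, headroom = 0, 0.0
    for _ in range(n_samples):
        usage = random.gauss(400, 50)       # baseline around 400m CPU
        if random.random() < 0.05:          # occasional 3x spike
            usage *= 3
        if usage > limit_mcpu:
            over += 1                       # throttled (CPU) or OOMKilled (memory)
        else:
            headroom += limit_mcpu - usage  # reserved but unused
    return over / n_samples, headroom / n_samples

for limit in (500, 800, 1500):
    p_over, waste = simulate(limit)
    print(f"limit={limit}m  over_limit={p_over:.1%}  avg_waste={waste:.0f}m")
```

A tight limit breaches on nearly every spike; a generous one almost never breaches but idles most of its reservation.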

Mathematical Modeling of the “Requests vs. Limits” Balancing Act
To analyze the challenge of balancing requests and limits in Kubernetes, let’s break it down:

1. The Problem with High Limits

When the limit (maximum resource usage) is set much higher than the application’s average usage, it results in wasted resources. For example:

  • The difference between the limit and the average usage represents unused resources.
  • While this prevents crashes during spikes, it leads to low efficiency because you’re over-provisioning.
In simple terms:

Efficiency = (Average Usage) / (Provisioned Limit).


When the limit is too high, efficiency drops significantly.

2. The Problem with Low Limits

When the limit is set too close to the average usage, it risks exceeding the limit during demand spikes. For CPU, this leads to throttling (performance degradation), and for memory, Kubernetes terminates the pod (OOMKilled).

The probability of a crash increases as the limit gets closer to the peak usage. For workloads with unpredictable spikes, this probability becomes harder to estimate, making it challenging to set an optimal value.
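
If we idealize usage as roughly normal (a strong simplification; spiky workloads usually have heavier tails, which is exactly why estimation is hard), the breach probability can be computed directly with Python’s standard library:

```python
from statistics import NormalDist

# Hypothetical workload: mean 400m CPU, standard deviation 120m.
usage = NormalDist(mu=400, sigma=120)

for limit in (450, 600, 800):
    p_exceed = 1 - usage.cdf(limit)  # P(usage > limit) in a sampling window
    print(f"limit={limit}m  P(exceed)={p_exceed:.2%}")
```

The closer the limit sits to the mean, the faster this probability climbs; with heavier-tailed real traffic it climbs faster still.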

3. The Balance Between Wasted Resources and Crash Risk
The ideal limit should balance:

  • Minimizing wasted resources (the difference between the limit and the average usage).
  • Keeping the crash probability low (the difference between the limit and peak usage).

For workloads with steady demand, this balance is easier to achieve. However, for workloads with irregular spikes, setting a fixed limit is particularly difficult because:

  • Increasing variability in resource usage makes crashes more likely for a given limit.
  • Over-provisioning to prevent crashes leads to significant inefficiency.
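
Continuing the same idealized model, one way to frame the balance is to pick the smallest limit whose breach probability stays under an error budget, then read off the efficiency that choice costs (all numbers are hypothetical):

```python
from statistics import NormalDist

usage = NormalDist(mu=400, sigma=120)  # same hypothetical workload as above
BUDGET = 0.01                          # tolerate a 1% chance of breaching

# Smallest candidate limit that keeps the breach probability under budget.
chosen = next(l for l in range(450, 1001, 25) if 1 - usage.cdf(l) < BUDGET)

efficiency = usage.mean / chosen       # average usage / provisioned limit
print(f"limit={chosen}m  efficiency={efficiency:.0%}  "
      f"P(exceed)={1 - usage.cdf(chosen):.2%}")
```

Raising the variability (sigma) pushes the chosen limit up and the efficiency down, which is exactly the inefficiency described above.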

4. Workloads with High Variability

In highly variable workloads, dynamic tools like Kubernetes’ Horizontal Pod Autoscaler (HPA) or Vertical Pod Autoscaler (VPA) are often required. These tools adjust resource limits in real time but add complexity:

  • They can sometimes create feedback loops (oscillations) between scaling decisions, as the sketch below illustrates.
  • Balancing both horizontal and vertical scaling efficiently is itself a challenging task.
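
To see how such a loop can arise, here is a deliberately naive Python toy (not the real HPA or VPA algorithms; every number is invented). A horizontal controller targets 50% utilization of the per-pod request, while a vertical controller resets the request to observed usage, so each adjustment invalidates the other’s signal and replicas ratchet upward while requests shrink:

```python
# Two competing controllers reacting to signals the other keeps changing.
demand = 100.0                 # total demand, arbitrary units
replicas, request = 4, 25.0    # pod count and per-pod CPU request

for step in range(6):
    per_pod_usage = demand / replicas
    utilization = per_pod_usage / request
    # Horizontal controller: scale toward 50% utilization of the request.
    replicas = max(1, round(replicas * utilization / 0.5))
    # Vertical controller: right-size the request to observed usage,
    # which makes utilization look like 100% again next round.
    request = demand / replicas
    print(f"step={step}  replicas={replicas}  request={request:.2f}")
```

Real controllers damp this with stabilization windows and update policies, but the underlying tension between the two scaling dimensions is the same.
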
Monolithic Applications and Other “Non-Fitting” Architectures
The “Distributed Monolith” Trap
Many teams attempt to adapt monolithic applications for Kubernetes by splitting them into smaller components based on functionality. While this can work in theory, it often results in a distributed monolith: a system where components are split but still tightly coupled. This architecture exacerbates problems like:

  • Unpredictable Load Peaks: Distributed components create random resource spikes, which can align statistically over time, leading to node overloads.
  • Networking Overhead: Communication between tightly coupled components adds latency and increases the potential for failure.
  • Cluster Instability: Kubernetes schedulers, designed for stateless workloads, struggle to manage highly stateful or tightly coupled applications.

For example, if several distributed components experience simultaneous load spikes, the cluster’s resource distribution may fail:

Sum of resources: S_n = Σ X_i (i = 1…n), where each X_i ∼ N(μ, σ²)

As the number of components n increases, the variance of the total load grows (for independent components, Var(S_n) = nσ², so the spread scales with √n). Larger absolute fluctuations around the mean amplify the risk of resource contention and crashes.
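
Under the same normal-usage idealization, and assuming independent components (correlated spikes make things worse), the standard library shows how quickly the overload probability grows with n (capacity and usage figures are invented):

```python
import math
from statistics import NormalDist

mu, sigma = 50.0, 15.0   # hypothetical per-component usage (mCPU)
node_capacity = 1000.0   # hypothetical allocatable capacity (mCPU)

for n in (4, 8, 16, 18, 19):
    # Sum of n independent normals: mean n*mu, variance n*sigma^2.
    total = NormalDist(mu=n * mu, sigma=math.sqrt(n) * sigma)
    p_overload = 1 - total.cdf(node_capacity)
    print(f"n={n:2d}  mean={total.mean:4.0f}m  P(S_n > capacity)={p_overload:.3%}")
```

The mean grows linearly while the spread grows with √n, so once the packed mean approaches capacity, overloads go from negligible to routine within a couple of extra components.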

Modern Applications That Don’t Fit
Even some modern, stateless microservices may struggle on Kubernetes if:

  • They require vertical scaling to handle peak loads.
  • They generate unpredictable usage patterns that exceed typical autoscaling capabilities.
  • Their operational model doesn’t align with Kubernetes’ orchestration principles (e.g., high startup times, large container images).

These scenarios highlight that being “modern” doesn’t always equate to being “cloud-native.” Kubernetes is optimized for applications adhering to cloud-native principles, such as those outlined in the Twelve-Factor App. Applications diverging from these principles often encounter operational and cost inefficiencies.

Alternatives to Kubernetes: When VMs or Nomad Make More Sense

When to Stick with VMs

For applications that don’t align with Kubernetes’ core strengths, VMs remain a reliable choice:

  • Simpler to manage for stateful, monolithic, or vertically scalable workloads.
  • Better suited for predictable workloads with steady demand.
  • Free of the overhead of adapting applications to Kubernetes’ ecosystem (e.g., StatefulSets, PersistentVolumes).

Nomad: A Simpler Orchestrator for Non-Native Workloads
HashiCorp Nomad is an alternative orchestrator designed to handle mixed workloads, including containers, VMs, and even standalone binaries. It offers several advantages:

  • Native VM Support: Unlike Kubernetes, Nomad can manage VMs directly without requiring containers.
  • Simplified Operation: Nomad’s architecture is more straightforward, making it easier to set up and manage for applications that don’t fit the cloud-native mold.
  • Lower Overhead: Nomad eliminates the need for the complex ecosystem of Kubernetes (e.g., etcd, CNI plugins), reducing operational burden.

While Nomad lacks Kubernetes’ extensive ecosystem, it’s often a better fit for applications that deviate from cloud-native principles.

The Role of Tools like Cast.ai

When teams insist on running non-cloud-native applications on Kubernetes, proprietary tools like Cast.ai can help optimize costs and manage complexity. These tools:

  • Automatically adjust horizontal and vertical scaling to prevent resource contention.
  • Resolve conflicts between Kubernetes’ Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler (VPA), which often compete and create oscillations in resource allocation.

While effective, relying on proprietary tools adds vendor lock-in and additional costs, which may not align with the original goal of reducing operational overhead.

Final Thoughts: Does Kubernetes Really Make Sense?
Kubernetes is a powerful platform, but it’s not a universal solution. For non-cloud-native applications or workloads better suited to VMs, Kubernetes often introduces unnecessary complexity and cost. Similarly, when simpler orchestration is needed, tools like Nomad can be a more practical choice.

Before migrating an application to Kubernetes, ask yourself:

  • Does the application need elastic scaling or benefit from containerization?
  • Are the costs of adapting the application justified by the potential gains?
  • Would alternative solutions like Nomad or VMs be simpler and more effective?

In many cases, the best way to optimize your infrastructure isn’t forcing Kubernetes into every scenario; it’s choosing the right tool for the job.
