Daniel Gonzalez for Playtomic

Migrating Your Cluster to EKS Auto Mode? What You Need to Know Before Taking the Leap

In December 2024, AWS introduced Amazon EKS Auto Mode, a feature designed to simplify Kubernetes cluster management by automating infrastructure provisioning and scaling. We decided to enable it on our existing cluster to test its capabilities, and here’s what we learned.

TL;DR: If you’re already running an EKS cluster with Karpenter and the AWS Load Balancer Controller, we don’t recommend migrating to EKS Auto Mode. Keep reading to understand why.


Instead of creating a new cluster, we enabled EKS Auto Mode on our development cluster. While activating the feature was straightforward, utilizing its capabilities required migrating certain resources. In this article, we’ll walk you through the steps we took to enable EKS Auto Mode and share our thoughts after using it for a few weeks.

Let's start by explaining what EKS Auto Mode provides:

EKS Auto Mode Overview

EKS Auto Mode takes over several components that you would otherwise install and operate yourself, running them as AWS-managed capabilities rather than as pods in your cluster:

  • Compute: nodes are provisioned and scaled automatically by a built-in, Karpenter-based controller.
  • Load balancing: Application and Network Load Balancers are managed by a built-in controller that replaces the self-managed AWS Load Balancer Controller.
  • Storage: block storage for workloads is provided through a managed EBS CSI driver.
  • Networking: core add-ons such as the VPC CNI are managed for you.

If you're already using these features, you'll need to modify some specifications in your Kubernetes resources.

AWS Load Balancer Controller Migration

Before moving workloads onto Auto Mode-managed nodes, migrating the AWS Load Balancer Controller is essential. The reason is that pods on these nodes can’t reach the Instance Metadata Service (IMDS): the nodes require IMDSv2 and are configured with HttpPutResponseHopLimit = 1, which limits the number of network hops a metadata token can travel, so requests coming from a pod never get a response. This setting cannot be changed, meaning our pods couldn’t use the metadata service, and the self-managed controller relies on it to discover details such as the region and VPC ID. If you’re using the Datadog agent, which also queries IMDS for host metadata, this might impact you too.
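If you need to keep the self-managed controller running on Auto Mode nodes while you migrate, one workaround is to stop it from calling IMDS at all by passing the region and VPC ID explicitly. Here’s a minimal sketch using the Helm chart’s region and vpcId values; the cluster name, region and VPC ID below are placeholders, not our actual setup:

# values.yaml for the aws-load-balancer-controller Helm chart (sketch)
# Providing these explicitly avoids the IMDS lookups that fail on
# Auto Mode nodes because of the hop limit of 1.
clusterName: my-cluster          # placeholder
region: eu-west-1                # placeholder
vpcId: vpc-xxxxxxxxxxxxxxxxx     # placeholder
serviceAccount:
  create: true
  name: aws-load-balancer-controller

With those values in place, the controller no longer depends on the metadata service, which buys you time to move the load balancers themselves.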

Additionally, existing load balancers can’t be transferred to the Auto Mode controller. AWS recommends duplicating them and performing a blue-green deployment. If traffic reaches your services through DNS records, the switch is simple: point the records at the new load balancers once they’re healthy and retire the old ones afterwards.
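In practice, the duplication means creating an IngressClass (and, optionally, an IngressClassParams) that targets Auto Mode’s built-in controller, then pointing a copy of each Ingress at it. The sketch below uses the resource kinds AWS documents for Auto Mode; the names and the internet-facing scheme are placeholders for illustration:

apiVersion: eks.amazonaws.com/v1
kind: IngressClassParams
metadata:
  name: alb-auto-mode            # placeholder name
spec:
  scheme: internet-facing
---
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: alb-auto-mode            # placeholder name
spec:
  # Auto Mode's built-in controller instead of the
  # self-managed AWS Load Balancer Controller
  controller: eks.amazonaws.com/alb
  parameters:
    apiGroup: eks.amazonaws.com
    kind: IngressClassParams
    name: alb-auto-mode

A duplicated Ingress that sets ingressClassName to the new class gets a fresh ALB from the Auto Mode controller, and you flip the DNS record once it’s serving traffic; Services of type LoadBalancer follow the same pattern through the eks.amazonaws.com/nlb load balancer class.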

Karpenter Migration

Karpenter’s configuration remains mostly unchanged, except that Auto Mode introduces its own Custom Resource Definition (CRD), NodeClass, which takes the place of the EC2NodeClass object used by open-source Karpenter. While AWS provides some documentation, it’s not as comprehensive as Karpenter’s official resources.

While EKS Auto Mode can automatically create default node pools, we opted to define our own for greater customization. Here's the NodeClass we created:

apiVersion: eks.amazonaws.com/v1
kind: NodeClass
metadata:
  name: pool-default
spec:
  # EBS volume used as each node's ephemeral storage
  ephemeralStorage:
    iops: 3000
    size: 50Gi
    throughput: 125
  # Built-in network policy support left at its defaults (allow all, no event logs)
  networkPolicy: DefaultAllow
  networkPolicyEventLogs: Disabled
  # IAM role the nodes run with
  role: AmazonEKSAutoNodeRole
  # Security group and subnets the nodes are launched into
  securityGroupSelectorTerms:
  - id: sg-xxxxxxxxxxxxxxxxxx
  snatPolicy: Random
  subnetSelectorTerms:
  - id: subnet-xxxxxxxx
  - id: subnet-xxxxxxxx
  - id: subnet-xxxxxxxx

Next, we configured the following NodePool:

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: pool-default
spec:
  disruption:
    budgets:
    - nodes: 10%
    consolidationPolicy: WhenEmptyOrUnderutilized
  template:
    metadata:
      labels:
        node_type: worker
    spec:
      expireAfter: 480h
      nodeClassRef:
        group: eks.amazonaws.com
        kind: NodeClass
        name: pool-default
      requirements:
      - key: kubernetes.io/os
        operator: In
        values:
        - linux
      - key: karpenter.sh/capacity-type
        operator: In
        values:
        - spot
        - on-demand
      # CPU = 4 CPUs
      - key: eks.amazonaws.com/instance-cpu
        operator: Gt
        values:
        - "3"
      - key: eks.amazonaws.com/instance-cpu
        operator: Lt
        values:
        - "5"
      # Memory = 16 GiB
      - key: eks.amazonaws.com/instance-memory
        operator: Gt
        values:
        - "16383"
      - key: eks.amazonaws.com/instance-memory
        operator: Lt
        values:
        - "16385"
      - key: eks.amazonaws.com/instance-hypervisor
        operator: In
        values:
        - nitro
      - key: kubernetes.io/arch
        operator: In
        values:
        - amd64
      terminationGracePeriod: 24h0m0s

During the initial setup, we encountered an issue where the node pool wasn’t reaching a ready state:

[Screenshot: the NodePool stuck in a not-ready state]

After running kubectl describe on the object, we discovered that its dependencies weren’t ready:

[Screenshot: kubectl describe output showing dependencies not ready]

We eventually resolved the issue (a misconfiguration of the NodeClass role), but Karpenter’s logs almost certainly contained the exact details of what went wrong, and having access to them would have let us fix the problem instantly.

Conclusion

For those setting up a new cluster, EKS Auto Mode is an excellent option. It offers a wide range of features with minimal configuration effort.

However, if your cluster has:

  • Controllers already configured and running
  • Infrastructure as Code (IaC) management in place
  • CI/CD pipelines for deploying changes

Migrating to EKS Auto Mode might not be worth the effort. While Auto Mode can save initial setup time, it offers minimal advantages once you've already established these capabilities.

In fact, you might lose some visibility. Currently, logs from Karpenter and other controllers are not accessible in EKS Auto Mode. Adding CloudWatch integration for these logs—similar to Control Plane logs—would be a valuable improvement.

As AWS enhances EKS Auto Mode with additional features and better tool integration, it may become an attractive option for more use cases. For now, it's best suited for new clusters, while existing setups should carefully weigh the benefits against migration efforts.

Call to Action

Have you tried EKS Auto Mode? What are your thoughts? Share your experiences or questions in the comments below!
