How to Scale the Ingress in Kubernetes (EKS)


This article will help you understand the challenges you will face when scaling ingress and how to address them. First and foremost, you need an EKS cluster where your application is running with ingress deployed.

We will be discussing ingress-nginx, not nginx-ingress; the latter is a separate controller maintained by NGINX, while ingress-nginx is the controller maintained by the Kubernetes community. Whenever I mention "ingress," I am referring to ingress-nginx.

For context, I had an EKS cluster running with various services behind an ingress.

Points I considered while scaling:

  1. Identify the purpose and determine whether scaling the ingress is actually necessary.

  2. You will need an understanding of how the HPA (Horizontal Pod Autoscaler) works, or you can use KEDA (Kubernetes Event-Driven Autoscaling).

  3. You will need an instance type with higher network bandwidth to handle the incoming load.

Don't worry! I will cover all of this in detail for you.

First of all, let's cover: what is an ingress?

In Kubernetes, an "Ingress" is an API object that allows external users to access services running within a cluster. It provides routing rules, defined within the Ingress resource, that configure access to your cluster. In effect, Ingress is a declarative load balancer/reverse proxy configuration resource.

The related term "egress" refers to traffic that leaves a cluster, originating from a pod and destined for an endpoint outside the cluster. In other words, egress traffic is outgoing network traffic from a pod to a destination outside the Kubernetes cluster.
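
As a concrete illustration, here is a minimal Ingress that routes one hostname to a backend Service; the host, Service name, and port are placeholders for your own setup:

```yaml
# Minimal Ingress: routes app.example.com (placeholder host) to a
# Service named my-app on port 80 (both placeholders).
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
spec:
  ingressClassName: nginx   # served by the ingress-nginx controller
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app
                port:
                  number: 80
```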

What is KEDA? And how can you use it within your cluster to scale your application?

You can learn more in the official KEDA documentation. In short, KEDA scales workloads based on event-driven metrics, such as queue depth or request rate, rather than only CPU and memory; a minimal example follows below.
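
As a minimal sketch, here is a ScaledObject that scales a hypothetical my-app Deployment on CPU utilization; all names and thresholds here are illustrative assumptions:

```yaml
# Illustrative ScaledObject: KEDA keeps between 1 and 10 replicas of a
# hypothetical "my-app" Deployment, targeting 70% average CPU utilization.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-app-scaler       # hypothetical name
spec:
  scaleTargetRef:
    name: my-app            # Deployment to scale (placeholder name)
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
    - type: cpu
      metricType: Utilization
      metadata:
        value: "70"         # target average CPU utilization in percent
```

Under the hood, KEDA generates an HPA for the target; note that the CPU trigger requires resource requests to be set on the target pods.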

Why do you need an instance with higher network bandwidth to handle the incoming load?

A higher network bandwidth instance is essential for scaling. For example, on AWS, the c6gn.4xlarge instance provides 16 vCPUs, 32 GiB of memory, and 25 Gbps of network performance. If you are sending an average of 340 requests per second and your internal instances are scaling but the ingress cannot keep up because the node's network I/O is saturated, you will see "Request Failed" errors more frequently. This indicates that you need more network bandwidth to serve the incoming requests effectively.
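
For reference, this is roughly what a dedicated high-bandwidth node group could look like with eksctl; the cluster name, region, and the workload: ingress label are placeholders for illustration:

```yaml
# Sketch of an eksctl config with a dedicated high-bandwidth node group.
# Cluster name, region, and the "workload: ingress" label are placeholders.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-cluster
  region: us-east-1
managedNodeGroups:
  - name: ingress-nodes
    instanceType: c6gn.4xlarge   # 16 vCPUs, 32 GiB, 25 Gbps (Graviton/arm64)
    desiredCapacity: 1
    labels:
      workload: ingress          # used later to pin the controller here
```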

Note: To send a large number of requests, I used a k6 script. You can follow the official k6 installation instructions to set it up.


Since I was receiving many "Request Failed" errors, I first checked whether the requests were reaching my service. If a request did reach the service, it was handled correctly; so the issue was not with my service itself, but further upstream at the ingress.

To tackle this, I added scaling based on the number of incoming requests. You can follow the method sketched below to scale based on the request rate you are receiving.
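
Here is the shape of that setup as a minimal sketch, assuming KEDA is installed, Prometheus scrapes the ingress-nginx controller metrics, and the controller runs as a Deployment named ingress-nginx-controller; the Prometheus address and the threshold are assumptions you should adapt:

```yaml
# Sketch: scale the ingress-nginx controller on its request rate.
# Prometheus address, namespace, and threshold are assumptions to adapt.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: ingress-nginx-scaler
  namespace: ingress-nginx
spec:
  scaleTargetRef:
    name: ingress-nginx-controller   # the controller Deployment
  minReplicaCount: 1
  maxReplicaCount: 5
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus-server.monitoring.svc:9090  # assumed
        query: sum(rate(nginx_ingress_controller_requests[2m]))      # req/s
        threshold: "300"   # add roughly one replica per 300 req/s
```

With this in place, KEDA adds a controller replica for roughly every 300 requests per second of sustained traffic.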


However, scaling alone did not solve the problem. The main issue is that when your ingress scales out, replicas on the same node share that node's network capacity. For instance, if my node group's instance provides 25 Gbps and my deployment scales to 2 pods on the same node, the network bandwidth is effectively split to about 12.5 Gbps per pod. This division can become a bottleneck if your application needs to serve a high number of requests.
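
As a complementary option, if you do run multiple controller replicas, you can keep them from competing for a single node's NIC by spreading them across nodes. A minimal sketch, assuming the standard ingress-nginx labels (the Helm chart exposes this under controller.topologySpreadConstraints):

```yaml
# Pod-spec snippet: spread controller replicas across nodes so each
# replica gets its own node's NIC instead of sharing one node's bandwidth.
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: kubernetes.io/hostname
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app.kubernetes.io/name: ingress-nginx   # standard chart label
```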

That's why I mentioned earlier that you need sufficient bandwidth to serve the requests. Based on your use case, you can choose the appropriate instance type for the node group. Additionally, I wanted that bandwidth to be dedicated to the ingress alone, so I explicitly scheduled the controller on the dedicated node group; that way my ingress could use the node's full network bandwidth to forward requests.
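
A minimal sketch of that pinning with the ingress-nginx Helm chart, assuming the hypothetical workload: ingress label from the node group sketch above:

```yaml
# ingress-nginx Helm values snippet: schedule the controller only on
# the dedicated high-bandwidth nodes (label is hypothetical).
controller:
  nodeSelector:
    workload: ingress
```

After rolling this out, `kubectl get pods -n ingress-nginx -o wide` (adjust the namespace to wherever your controller runs) shows which node the controller landed on.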
