DEV Community

Mazen Ramadan

How to Implement a Load Balancer Using Nginx & Docker

Scaling becomes necessary once your system grows in popularity.
There are two types of scaling:
Vertical Scaling - Adding more resources (CPU, RAM, storage) to your single server.
Horizontal Scaling - Spinning up more servers and distributing the load between them, which is called load balancing.

In this article, we will talk about load balancing: what it is and how to implement a load balancer using Nginx and Docker.
Let's get started!

What is Load Balancing?

Load balancing is the process of distributing incoming network traffic across multiple servers, applications, or network resources to ensure that no single resource is overwhelmed with too much traffic. The goal of load balancing is to optimize resource usage, maximize throughput, minimize response time, and ensure high availability and reliability of the system.

In a typical load balancing scenario, incoming requests are first directed to a load balancer, which acts as a front-end for the system. The load balancer then forwards the requests to the appropriate server, application, or resource based on a set of predefined rules, policies, or algorithms. This ensures that all resources are used efficiently, and no single resource is overburdened. Load balancers are effective at:

  • Preventing requests from going to unhealthy servers.
  • Preventing overloading resources.
  • Helping to eliminate a single point of failure.

[Image: load balancer diagram]

There are several different types of load balancing that can be used, depending on the specific requirements and constraints of the system:

1. Round-robin

This is the simplest and most common type of load balancing, where incoming requests are distributed to servers in a cyclical order. Each server is given an equal chance to process a request, regardless of its current load or performance.



servers = [server1, server2, server3]
next_server_index = 0

def handle_request(request):
    global next_server_index
    server = servers[next_server_index]
    next_server_index = (next_server_index + 1) % len(servers)
    server.process_request(request)


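To see the rotation in action, here is a self-contained sketch of the same idea, with plain strings standing in for real server objects (the names server1, server2, server3 are placeholders):

```python
# Minimal round-robin sketch with string placeholders instead of real servers
servers = ["server1", "server2", "server3"]
next_server_index = 0

def pick_server():
    """Return the next server in cyclic order."""
    global next_server_index
    server = servers[next_server_index]
    next_server_index = (next_server_index + 1) % len(servers)
    return server

# Four requests wrap around to the first server again
order = [pick_server() for _ in range(4)]
print(order)  # ['server1', 'server2', 'server3', 'server1']
```

Every server gets an equal share of requests, regardless of how loaded it currently is.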

2. Weighted round-robin

This is a variant of round-robin load balancing, where each server is assigned a weight based on its processing capacity, and requests are distributed accordingly. Servers with higher weights are given a higher proportion of incoming traffic.



servers = [{'server': server1, 'weight': 2},
           {'server': server2, 'weight': 3},
           {'server': server3, 'weight': 1}]
next_server_index = 0
remaining_weight = servers[0]['weight']

def handle_request(request):
    global next_server_index, remaining_weight
    server = servers[next_server_index]['server']
    server.process_request(request)
    remaining_weight -= 1
    # Move to the next server once this one has served its share
    if remaining_weight == 0:
        next_server_index = (next_server_index + 1) % len(servers)
        remaining_weight = servers[next_server_index]['weight']



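With weights 2, 3, and 1, one full cycle should send two requests to the first server, three to the second, and one to the third. A runnable sketch of that cycle, using placeholder names instead of real servers:

```python
# Weighted round-robin sketch: each server handles `weight` requests per cycle
servers = [{"name": "server1", "weight": 2},
           {"name": "server2", "weight": 3},
           {"name": "server3", "weight": 1}]
next_server_index = 0
remaining_weight = servers[0]["weight"]

def pick_server():
    """Return the current server, advancing once its weight is used up."""
    global next_server_index, remaining_weight
    name = servers[next_server_index]["name"]
    remaining_weight -= 1
    if remaining_weight == 0:
        next_server_index = (next_server_index + 1) % len(servers)
        remaining_weight = servers[next_server_index]["weight"]
    return name

# One full cycle: 2 + 3 + 1 = 6 requests
cycle = [pick_server() for _ in range(6)]
print(cycle)  # server1 twice, server2 three times, server3 once
```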

3. Least connections

In this type of load balancing, incoming requests are sent to the server with the fewest active connections, in order to avoid overloading any one server. This is particularly useful for applications that involve long-lived connections, such as streaming media or gaming.



servers = [server1, server2, server3]

def handle_request(request):
    # Pick the server with the fewest active connections
    min_server = min(servers, key=lambda s: s.connections)
    min_server.process_request(request)



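The same selection can be exercised end to end with dummy server objects that track their active connection counts (the Server class here is an illustrative stand-in, not a real backend):

```python
from dataclasses import dataclass

# Dummy stand-in for a backend server, tracking active connections
@dataclass
class Server:
    name: str
    connections: int

servers = [Server("server1", 5), Server("server2", 2), Server("server3", 8)]

def pick_server():
    # Select the server with the fewest active connections
    return min(servers, key=lambda s: s.connections)

chosen = pick_server()
chosen.connections += 1  # the new request becomes an active connection
print(chosen.name)  # server2 (it had the fewest connections)
```

Note that a real balancer must also decrement the count when a connection closes, otherwise the counters drift upward forever.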

4. IP hash load balancing

This method uses the IP address of the client to determine which server to send the request to. The IP address is hashed, and the resulting value is used to select the server. This ensures that requests from the same client are always sent to the same server, which can be useful for maintaining session state or other application-specific requirements.



servers = [server1, server2, server3]

def handle_request(request):
    client_ip = request.get_client_ip()
    hashed_ip = hash(client_ip)
    server_index = hashed_ip % len(servers)
    server = servers[server_index]
    server.process_request(request)



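One caveat when turning this into real code: Python's built-in hash() is randomized per process, so the client-to-server mapping would change on every restart. A sketch using a deterministic hash instead (the IP address and server names are placeholders):

```python
import hashlib

servers = ["server1", "server2", "server3"]

def pick_server(client_ip: str) -> str:
    # md5 gives a deterministic hash; Python's built-in hash() is
    # randomized per process, which would break routing across restarts
    digest = int(hashlib.md5(client_ip.encode()).hexdigest(), 16)
    return servers[digest % len(servers)]

# The same client always lands on the same server
print(pick_server("203.0.113.7") == pick_server("203.0.113.7"))  # True
```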

5. Layer 4 load balancing

This type of load balancing operates at the transport layer (TCP/UDP) of the network stack and uses information such as source IP, destination IP, source port, and destination port to distribute traffic across servers.



servers = [server1, server2, server3]

def handle_request(request):
    dest_ip = request.get_dest_ip()
    dest_port = request.get_dest_port()
    # Hash the destination ip:port pair to pick a server
    server_index = hash(dest_ip + ':' + str(dest_port)) % len(servers)
    server = servers[server_index]
    server.process_request(request)




6. Layer 7 load balancing

This type of load balancing operates at the application layer of the network stack and uses information such as HTTP headers, cookies, and URL paths to distribute traffic across servers. This allows for more intelligent routing based on application-specific criteria, such as session affinity, content-based routing, or SSL offloading.



servers = [{'server': server1, 'route': '/api/*'},
           {'server': server2, 'route': '/auth/*'},
           {'server': server3, 'route': '/*'}]

def handle_request(request):
    # Routes are checked in order, so the catch-all '/*' must come last
    for info in servers:
        if request.matches_route(info['route']):
            server = info['server']
            server.process_request(request)
            break



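The route matching itself can be sketched with the standard library's fnmatch, which supports the same * glob patterns (the paths and server names are illustrative placeholders):

```python
from fnmatch import fnmatch

# Route table: checked in order, so the catch-all '/*' comes last
routes = [("/api/*", "server1"),
          ("/auth/*", "server2"),
          ("/*", "server3")]

def pick_server(path: str) -> str:
    """Return the server whose route pattern first matches the path."""
    for pattern, server in routes:
        if fnmatch(path, pattern):
            return server
    raise ValueError("no route matched")

print(pick_server("/api/users"))   # server1
print(pick_server("/auth/login"))  # server2
print(pick_server("/index.html"))  # server3
```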

Now that we know what load balancing is, let's move on to the practical guide.

Implement Nginx Load Balancer

Let's say we have two Python servers, each deployed in its own container. We will then use Nginx as a load balancer in front of them.

Here is our file structure:



nginx-load-balancer
    |
    |---app1
    |     |-- app1.py
    |     |-- Dockerfile
    |     |-- requirements.txt
    |
    |---app2
    |     |-- app2.py
    |     |-- Dockerfile
    |     |-- requirements.txt
    |
    |---nginx
    |     |-- nginx.conf
    |     |-- Dockerfile
    |
    |------ docker-compose.yml



We have two basic Flask servers, each returning a short message that indicates which server we are connected to:

app1/app1.py



from flask import request, Flask

app1 = Flask(__name__)


@app1.route('/')
def hello_world():
    return '<h1>Hello from server 1</h1>'


if __name__ == '__main__':
    app1.run(debug=True, host='0.0.0.0')



Each Flask server is deployed in a Docker container:

app1/Dockerfile



FROM python:3

COPY ./requirements.txt /requirements.txt

WORKDIR /

RUN pip install -r requirements.txt

COPY . /

ENTRYPOINT [ "python3" ]

CMD [ "app1.py" ]



In the Dockerfile above, we start from the Python 3 base image and copy the requirements file. After that, we set the working directory and install the dependencies. Finally, we copy the code and run the server.

We will do the same for Nginx:

nginx/nginx.conf



upstream loadbalancer {
    server 172.17.0.1:5001 weight=5;
    server 172.17.0.1:5002 weight=5;
}

server {
    location / {
        proxy_pass http://loadbalancer;
    }
}



Here we use the round-robin balancing algorithm. The proportion of traffic each server receives is controlled by the weight parameter; since both weights are equal, the load is balanced equally across the two servers.
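For reference, Nginx also supports some of the other strategies covered earlier through directives in the upstream block. A sketch of the same block using them (least_conn and ip_hash are standard directives in open-source Nginx; pick at most one):

```nginx
upstream loadbalancer {
    # Uncomment ONE of the following to replace the default round-robin:
    # least_conn;   # least connections
    # ip_hash;      # client-IP affinity (same client -> same server)

    server 172.17.0.1:5001;
    server 172.17.0.1:5002;
}
```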

Next, let's dockerize the Nginx configuration. The Dockerfile below copies the conf file above to the right path inside the container when the image is built.

nginx/Dockerfile



FROM nginx
RUN rm /etc/nginx/conf.d/default.conf
COPY nginx.conf /etc/nginx/conf.d/default.conf



Finally, let's create a docker-compose file that spins up the whole architecture:

docker-compose.yml



version: '3'
services:
  app1:
    build: ./app1
    ports:
    - "5001:5000"
  app2:
    build: ./app2
    ports:
    - "5002:5000"
  nginx:
    build: ./nginx 
    ports:
    - "8080:80"
    depends_on:
      - app1
      - app2



Here is what this file does:

  • It builds images for app1, app2, and Nginx based on our Dockerfiles, then spins up containers from those images.

  • Inside the app1 and app2 containers, the application listens on port 5000 (Flask's default); these ports are mapped to host ports 5001 and 5002 respectively.

  • The load balancer routes traffic to the appropriate application through those host ports.

  • The load balancer (Nginx) maps its internal port 80 to host port 8080, so we can access the application at http://localhost:8080

Finally, you can spin up the architecture using this command from the base directory:



docker-compose up



Requests will be routed to both servers equally:

[Screenshots: browser responses showing "Hello from server 1" and "Hello from server 2"]
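You can also watch the alternation from the command line. Assuming curl is installed and the compose stack above is running, something like this should show the responses switching between the two servers:

```shell
# Send four requests to the load balancer; with equal weights the
# responses should alternate between server 1 and server 2
for i in 1 2 3 4; do
  curl -s http://localhost:8080/
  echo
done
```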

You can find the code used here.
