Scaling becomes necessary when your system grows in popularity.
There are two types of scaling:
Vertical Scaling - Adding more resources (CPU, RAM, storage) to your single server.
Horizontal Scaling - Spinning up more servers and distributing the load between them, which is called load balancing.
In this article, we will talk about load balancing: what it is and how to implement a load balancer using Nginx and Docker.
Let's get started!
What is Load Balancing?
Load balancing is the process of distributing incoming network traffic across multiple servers, applications, or network resources to ensure that no single resource is overwhelmed with too much traffic. The goal of load balancing is to optimize resource usage, maximize throughput, minimize response time, and ensure high availability and reliability of the system.
In a typical load balancing scenario, incoming requests are first directed to a load balancer, which acts as a front-end for the system. The load balancer then forwards the requests to the appropriate server, application, or resource based on a set of predefined rules, policies, or algorithms. This ensures that all resources are used efficiently, and no single resource is overburdened. Load balancers are effective at:
- Preventing requests from going to unhealthy servers.
- Preventing overloading resources.
- Helping to eliminate a single point of failure.
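Conceptually, the load balancer sits in front of a pool of backends and applies a strategy to pick one for each request. Here is a minimal sketch of that dispatch flow (the class and function names are illustrative, not a real library):

```python
class LoadBalancer:
    """Forwards each incoming request to a backend chosen by a strategy."""

    def __init__(self, backends, strategy):
        self.backends = backends
        self.strategy = strategy  # callable: list of backends -> one backend

    def handle(self, request):
        backend = self.strategy(self.backends)
        return backend, request  # in a real system: forward request, return response

# Round-robin as one possible strategy, implemented as a closure holding a counter.
def make_round_robin():
    state = {'i': 0}
    def pick(backends):
        backend = backends[state['i'] % len(backends)]
        state['i'] += 1
        return backend
    return pick

lb = LoadBalancer(['server1', 'server2'], make_round_robin())
print([lb.handle('req')[0] for _ in range(4)])
# ['server1', 'server2', 'server1', 'server2']
```

Swapping in a different `strategy` callable changes the routing policy without touching the dispatch code, which is essentially what the algorithms below vary.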
There are several different types of load balancing that can be used, depending on the specific requirements and constraints of the system:
1. Round-robin
This is the simplest and most common type of load balancing, where incoming requests are distributed to servers in a cyclical order. Each server is given an equal chance to process a request, regardless of its current load or performance.
servers = [server1, server2, server3]
next_server_index = 0

def handle_request(request):
    global next_server_index
    server = servers[next_server_index]
    next_server_index = (next_server_index + 1) % len(servers)
    server.process_request(request)
2. Weighted round-robin
This is a variant of round-robin load balancing, where each server is assigned a weight based on its processing capacity, and requests are distributed accordingly. Servers with higher weights are given a higher proportion of incoming traffic.
servers = [{'server': server1, 'weight': 2},
           {'server': server2, 'weight': 3},
           {'server': server3, 'weight': 1}]
next_server_index = 0
remaining = servers[0]['weight']  # requests left for the current server

def handle_request(request):
    global next_server_index, remaining
    server_info = servers[next_server_index]
    server_info['server'].process_request(request)
    remaining -= 1
    if remaining == 0:  # this server used its share; move to the next one
        next_server_index = (next_server_index + 1) % len(servers)
        remaining = servers[next_server_index]['weight']
3. Least connections
In this type of load balancing, incoming requests are sent to the server with the fewest active connections, in order to avoid overloading any one server. This is particularly useful for applications that involve long-lived connections, such as streaming media or gaming.
servers = [server1, server2, server3]

def handle_request(request):
    # pick the server with the fewest active connections
    min_server = min(servers, key=lambda s: s.connections)
    min_server.process_request(request)
4. IP hash load balancing
This method uses the IP address of the client to determine which server to send the request to. The IP address is hashed, and the resulting value is used to select the server. This ensures that requests from the same client are always sent to the same server, which can be useful for maintaining session state or other application-specific requirements.
servers = [server1, server2, server3]

def handle_request(request):
    client_ip = request.get_client_ip()
    hashed_ip = hash(client_ip)
    server_index = hashed_ip % len(servers)
    server = servers[server_index]
    server.process_request(request)
5. Layer 4 load balancing
This type of load balancing operates at the transport layer (TCP/UDP) of the network stack and uses information such as source IP, destination IP, source port, and destination port to distribute traffic across servers.
servers = [server1, server2, server3]

def handle_request(request):
    dest_ip = request.get_dest_ip()
    dest_port = request.get_dest_port()
    server_index = hash(dest_ip + ':' + str(dest_port)) % len(servers)
    server = servers[server_index]
    server.process_request(request)
6. Layer 7 load balancing
This type of load balancing operates at the application layer of the network stack and uses information such as HTTP headers, cookies, and URL paths to distribute traffic across servers. This allows for more intelligent routing based on application-specific criteria, such as session affinity, content-based routing, or SSL offloading.
servers = [{'server': server1, 'route': '/api/*'},
           {'server': server2, 'route': '/auth/*'},
           {'server': server3, 'route': '/*'}]

def handle_request(request):
    for info in servers:
        if request.matches_route(info['route']):
            info['server'].process_request(request)
            break
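To compare some of these strategies concretely, here is a runnable sketch using mock servers (the `MockServer` class and server names are illustrative):

```python
import itertools

class MockServer:
    def __init__(self, name):
        self.name = name
        self.connections = 0
        self.handled = []

    def process_request(self, request):
        self.handled.append(request)

servers = [MockServer('server1'), MockServer('server2'), MockServer('server3')]

# Round-robin: cycle through servers regardless of load.
rr = itertools.cycle(servers)

# Least connections: pick the server with the fewest active connections.
def least_connections(servers):
    return min(servers, key=lambda s: s.connections)

# IP hash: the same client IP always maps to the same server.
def ip_hash(servers, client_ip):
    return servers[hash(client_ip) % len(servers)]

for i in range(6):
    next(rr).process_request(f'req{i}')
print([len(s.handled) for s in servers])  # [2, 2, 2]

servers[0].connections = 5
servers[1].connections = 1
servers[2].connections = 3
print(least_connections(servers).name)  # server2

# Within one process, the same client IP always hashes to the same server.
assert ip_hash(servers, '10.0.0.7') is ip_hash(servers, '10.0.0.7')
```

Note that Python's `hash()` of strings is randomized across runs, so a production IP-hash balancer would use a stable hash (e.g. from `hashlib`) to keep assignments consistent across restarts.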
Now that we know about load balancing, let's move on to the practical guide.
Implement Nginx Load Balancer
Let's consider having 2 Flask servers, each deployed to its own container. We will then use Nginx as a load balancer in front of them.
Here is our file structure:
nginx-load-balancer
|
|---app1
| |-- app1.py
| |-- Dockerfile
| |-- requirements.txt
|
|---app2
| |-- app2.py
| |-- Dockerfile
| |-- requirements.txt
|
|---nginx
| |-- nginx.conf
| |-- Dockerfile
|
|------ docker-compose.yml
We have 2 basic Flask servers, each returning a text that clarifies which server we are connected to:
app1/app1.py
from flask import Flask

app1 = Flask(__name__)

@app1.route('/')
def hello_world():
    return '<h1>Hello from server 1</h1>'

if __name__ == '__main__':
    app1.run(debug=True, host='0.0.0.0')
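app2/app2.py is almost identical, differing only in the app name and the returned message; shown here for completeness (it is started the same way as app1, with `app2.run(debug=True, host='0.0.0.0')` under the `if __name__ == '__main__':` guard):

```python
from flask import Flask

app2 = Flask(__name__)

@app2.route('/')
def hello_world():
    return '<h1>Hello from server 2</h1>'
```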
Each Flask server is deployed to a Docker container:
app1/Dockerfile
FROM python:3
COPY ./requirements.txt /requirements.txt
WORKDIR /
RUN pip install -r requirements.txt
COPY . /
ENTRYPOINT [ "python3" ]
CMD [ "app1.py" ]
In the previous code, we tell Docker to start from the Python 3 base image and copy the requirements file. After that, we set the working directory and install the dependencies. Finally, we copy the code and run the server.
We will do the same for Nginx:
nginx/nginx.conf
upstream loadbalancer {
    server 172.17.0.1:5001 weight=5;
    server 172.17.0.1:5002 weight=5;
}

server {
    location / {
        proxy_pass http://loadbalancer;
    }
}
Here we use the round-robin balancing algorithm. The share of traffic each server receives is specified using the weight parameter; since both servers have the same weight, the load is balanced equally between them.
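If you prefer one of the other strategies discussed earlier, Nginx supports them through directives in the upstream block. For example, a least-connections variant of the same configuration would look like this (the backend addresses mirror the ones above):

```
upstream loadbalancer {
    least_conn;               # pick the backend with the fewest active connections
    server 172.17.0.1:5001;
    server 172.17.0.1:5002;
}
```

Replacing `least_conn;` with `ip_hash;` gives IP-hash balancing instead, pinning each client to one backend.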
After that, let's dockerize the Nginx configuration. The Dockerfile will copy the above conf file to the relevant path inside the container when it starts.
nginx/Dockerfile
FROM nginx
RUN rm /etc/nginx/conf.d/default.conf
COPY nginx.conf /etc/nginx/conf.d/default.conf
Finally, let's create a docker-compose file; it will be a wrapper that spins up the whole architecture:
docker-compose.yml
version: '3'
services:
  app1:
    build: ./app1
    ports:
      - "5001:5000"
  app2:
    build: ./app2
    ports:
      - "5002:5000"
  nginx:
    build: ./nginx
    ports:
      - "8080:80"
    depends_on:
      - app1
      - app2
Here is what this file does:
- It builds images for app1, app2, and Nginx based on our Dockerfiles, then spins up containers from those images.
- The port opened inside the app1 and app2 containers is 5000 (the default port used by Flask); these ports are mapped to 5001 and 5002 on the host.
- The load balancer routes traffic to the appropriate application based on those ports.
- The load balancer (Nginx) exposes its internal port 80 as 8080, so we can access the application at http://localhost:8080.
Finally, you can spin up the architecture using this command from the base directory:
docker-compose up
Requests will be routed to both servers equally.
You can find the code used here.