How to set up simple load balancing with IPVS, with a Docker demo

#k8s #ipvs #network #linux

A few days ago, I was reading about the Kubernetes network model, especially about services and the kube-proxy component, and I discovered that kube-proxy has three modes, which are userspace, iptables and ipvs.

The userspace mode is old and slow, and nowadays nobody recommends it. The iptables mode is the default for kube-proxy: in this mode, kube-proxy uses iptables rules to forward packets destined for a service to one of that service's backends. The last one is ipvs. I did not know what it was, so I read about it.

What is IPVS?

IPVS (IP Virtual Server) is built on top of Netfilter and implements transport-layer load balancing as part of the Linux kernel.

IPVS is incorporated into the LVS (Linux Virtual Server), where it runs on a host and acts as a load balancer in front of a cluster of real servers. IPVS can direct requests for TCP- and UDP-based services to the real servers, and make services of the real servers appear as virtual services on a single IP address.

That means IPVS is a layer 4 load balancer that lives in the Linux kernel. If you don't know the difference between L4 and L7 load balancing, here is a good explanation.

LB L4

At Layer 4, a load balancer has visibility on network information such as application ports and protocol (TCP/UDP). The load balancer delivers traffic by combining this limited network information with a load balancing algorithm such as round-robin and by calculating the best destination server based on least connections or server response times.

So at this layer you are not parsing the data in the packets, and you don't know what's inside them. For instance, if you are using an L4 load balancer and receive an HTTP request, you can't see the path, the body, or the headers of that request, so you can't make a smart decision based on them.

LB L7

At Layer 7, a load balancer has application awareness and can use this additional application information to make more complex and informed load balancing decisions. With a protocol such as HTTP, a load balancer can uniquely identify client sessions based on cookies and use this information to deliver all of a client's requests to the same server. This server persistence using cookies can be based on the server’s cookie or by active cookie injection where a load balancer cookie is inserted into the connection. Free LoadMaster includes cookie injection as one of many methods of ensuring session persistence.

Simple Demo

Now that you know what IPVS is, let's build an ultra-simple demo using Docker: an IPVS load balancer in front of two containers.

The first thing that we need is the CLI tool for interacting with the IP virtual server table in the kernel.

ipvsadm - Linux Virtual Server administration

$ sudo apt-get install -y ipvsadm

Create the virtual service.

Now, we can use the CLI to create a new virtual service:

ipvsadm COMMAND [protocol] service-address [scheduling-method] [persistence options]

According to the ipvsadm documentation, the -A flag means "Add a virtual service" and the -s flag selects the scheduling method. We will start with rr (Round Robin). Other options include wrr (Weighted Round Robin), lc (Least-Connection), lblc (Locality-Based Least-Connection), and more.

So we create a TCP virtual service (-t) at the address 100.100.100.100:80, using Round Robin as the scheduling method.

$ sudo ipvsadm -A -t 100.100.100.100:80 -s rr
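To build some intuition for what rr means, here is a toy simulation in plain shell. It is only a sketch of the ordering idea, not how the kernel implements the scheduler, and the function name rr_pick and the names backend-a and backend-b are made up for illustration.

```shell
# Toy model of round robin: request number N goes to backend N mod (number of backends).
# rr_pick is a hypothetical helper; the real scheduler runs inside the kernel.
rr_pick() (
  request_no=$1                       # 0-based request counter
  shift                               # the remaining arguments are the backends
  target=$(( request_no % $# ))       # index of the backend that serves this request
  i=0
  for backend in "$@"; do
    if [ "$i" -eq "$target" ]; then
      printf '%s\n' "$backend"
      break
    fi
    i=$(( i + 1 ))
  done
)

# Four consecutive requests against two placeholder backends.
for n in 0 1 2 3; do
  rr_pick "$n" backend-a backend-b
done
# prints backend-a, backend-b, backend-a, backend-b (one per line)
```

The real scheduler does not promise which server answers first, only that requests rotate through the servers in turn.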

Create two Docker containers

We are going to use the image jwilder/whoami for our containers. This image just returns the container's ID.

$ docker run -d -p 8000:8000 --name first -t jwilder/whoami
cd977829ae0c76236a1506c497d5ce1628f1f701f8ed074916b21fc286f3d0d1

$ docker run -d -p 8001:8000 --name second -t jwilder/whoami
5886b1ed7bd4095cb02b32d1642866095e6f4ce1750276bd9fc07e91e2fbc668

Then we get the IP addresses of these containers using docker inspect:

$ docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' first
172.17.0.2

$ docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' second
172.17.0.3

If we curl one of these containers directly, we see its container ID.

$ curl 172.17.0.2:8000
I'm cd977829ae0c

Add the IPs to the virtual service.

We have the containers' IPs, so let's add them to the virtual service. With ipvsadm, the -a flag adds a real server (given with -r) to the virtual service specified with -t, and -m selects masquerading (network address translation, or NAT) as the forwarding method.

$ sudo ipvsadm -a -t 100.100.100.100:80 -r 172.17.0.2:8000 -m
$ sudo ipvsadm -a -t 100.100.100.100:80 -r 172.17.0.3:8000 -m

We can use ipvsadm to list the virtual services along with their real servers.

$ sudo ipvsadm -l
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  100.100.100.100:http rr
  -> 172.17.0.2:8000              Masq    1      0          0
  -> 172.17.0.3:8000              Masq    1      0          0
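If you want to use this listing in a script, the real-server rows are easy to pick out because they start with "->". The helper below is a small sketch that assumes the column layout shown above. The name list_backends is made up for illustration.

```shell
# Print "address forward-method weight" for every real server
# found in an ipvsadm listing read from stdin.
# Assumes the column layout shown above; list_backends is a hypothetical helper.
list_backends() {
  awk '$1 == "->" && $2 != "RemoteAddress:Port" { print $2, $3, $4 }'
}

# Usage (needs root, so it is left as a comment here):
#   sudo ipvsadm -L -n | list_backends
```

The -n flag asks ipvsadm for numeric addresses and ports, so the service shows up as 100.100.100.100:80 instead of 100.100.100.100:http.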

If you added the wrong server, you can remove it with the -d flag.

$ sudo ipvsadm -d -t 100.100.100.100:http -r 172.17.0.3:8000

Now our service does L4 load balancing, using Round Robin as the balancing algorithm.

$ curl 100.100.100.100
I'm 5886b1ed7bd4

$ curl 100.100.100.100
I'm cd977829ae0c

$ curl 100.100.100.100
I'm 5886b1ed7bd4

$ curl 100.100.100.100
I'm cd977829ae0c
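Rather than eyeballing the responses, you can count them. The helper below is a minimal sketch that tallies identical lines from stdin. The name tally is made up for illustration.

```shell
# Count how many times each distinct line appears on stdin,
# printing "count line" pairs in a stable order.
# tally is a hypothetical helper, not part of ipvsadm or curl.
tally() {
  awk '{ count[$0]++ } END { for (line in count) print count[line], line }' | LC_ALL=C sort
}

# Usage against the demo's virtual service (left as a comment because it needs the demo running):
#   for i in 1 2 3 4 5 6 7 8 9 10; do curl -s 100.100.100.100; done | tally
```

With two servers and Round Robin, ten requests should split evenly between the two container IDs.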

As you can see, doing load balancing with IPVS is pretty straightforward.
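When you are done, you may want to undo the demo. The sketch below assumes the names used above (the virtual service 100.100.100.100:80 and the containers first and second). The helpers run and cleanup_demo are made up for illustration, and DRY_RUN=1 only prints the commands instead of executing them.

```shell
# run: print a command, and execute it only when DRY_RUN is not 1.
run() {
  printf '+ %s\n' "$*"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    "$@"
  fi
}

# cleanup_demo: hypothetical helper that removes everything the demo created.
cleanup_demo() {
  run sudo ipvsadm -D -t 100.100.100.100:80   # delete the virtual service and its real servers
  run docker rm -f first second               # stop and remove both demo containers
}

# Preview the commands without changing anything.
DRY_RUN=1 cleanup_demo
```

Calling cleanup_demo without DRY_RUN=1 actually runs both commands.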

In K8s, one advantage of choosing IPVS mode for kube-proxy over iptables is that IPVS is a Linux kernel feature designed for load balancing. It offers several scheduling algorithms, such as round-robin, shortest-expected-delay, and least connections. It also has an optimized O(1) lookup routine based on a hash table, rather than an O(n) list of sequential rules: "iptables adds the rules in a sequential chain that grows roughly in proportion to the number of services and number of backend pods behind each service."
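The difference between the two lookup strategies can be sketched in a few lines of bash (version 4 or later, for associative arrays). This is only a toy model of "sequential chain versus hash table", not how iptables or IPVS are implemented. The addresses, the backend names, and the functions linear_lookup and hash_lookup are all made up for illustration.

```shell
# Build 1000 fake services, stored both as an ordered list and as a hash table.
declare -A table
rules=()
for (( i = 1; i <= 1000; i++ )); do
  vip="10.0.$(( i / 256 )).$(( i % 256 )):80"
  rules+=("$vip")
  table["$vip"]="backend-$i"
done

# Sequential scan: walk the list until the address matches,
# and print how many entries had to be compared.
linear_lookup() {
  local target=$1 checks=0 vip
  for vip in "${rules[@]}"; do
    checks=$(( checks + 1 ))
    if [ "$vip" = "$target" ]; then
      echo "$checks"
      return 0
    fi
  done
  return 1
}

# Hash lookup: go straight to the entry by key, regardless of table size.
hash_lookup() {
  local target=$1
  [ -n "${table[$target]+set}" ] && echo "${table[$target]}"
}

echo "sequential scan: $(linear_lookup 10.0.3.232:80) comparisons"
echo "hash lookup:     $(hash_lookup 10.0.3.232:80)"
```

For the last service in the list, the sequential scan needs 1000 comparisons, while the hash lookup goes directly to the entry.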