K8s deployment scaling is the process of adjusting the number of pod replicas in a deployment to meet the application’s changing resource requirements or traffic load.
Scaling can be done in two ways: manually, using commands like kubectl scale, or automatically, through features such as Horizontal Pod Autoscaling (HPA). HPA adjusts the number of pod replicas based on metrics like CPU or memory usage, automatically responding to resource demand.
This flexibility allows K8s to dynamically allocate resources, ensuring optimal performance, availability, and cost efficiency in environments with fluctuating workloads.
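As a quick illustration (a minimal sketch, assuming a deployment named my-app already exists in the cluster), the two approaches look like this:
## manual scaling: set the replica count explicitly
kubectl scale deployment my-app --replicas=5
## automatic scaling: create an HPA that targets ~50% average CPU with 1-10 replicas
kubectl autoscale deployment my-app --cpu-percent=50 --min=1 --max=10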
The importance of manual scaling in K8s:
- Immediate Control: It quickly scales up or down based on immediate needs, such as traffic spikes or maintenance.
- Customization: It adjusts replica counts, independent of auto-scaling rules.
- Testing: It helps to simulate different load conditions for stress testing or performance evaluations.
- Quick Recovery: It instantly increases replicas to replace unresponsive pods or mitigate failures.
- Operational Maintenance: It scales down during maintenance and scales back up afterward.
- Resource Management: It adjusts based on specific resource needs, such as specialized workloads.
- Quota Management: It ensures scaling stays within resource quotas or limits.
- Scheduled Scaling: Manual scaling can be automated at specific times with cron jobs (see the sketch after this list).
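For the scheduled-scaling item, a minimal sketch of a CronJob that runs kubectl scale at fixed times. The names scale-up-frontend, scaler, and frontend are assumed for illustration, and the scaler ServiceAccount needs RBAC permission to update the deployment's scale subresource:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: scale-up-frontend
spec:
  schedule: "0 8 * * 1-5"   # every weekday at 08:00
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: scaler   # assumed ServiceAccount with permission to scale deployments
          restartPolicy: OnFailure
          containers:
          - name: kubectl
            image: bitnami/kubectl:latest
            command: ["kubectl", "scale", "deployment", "frontend", "--replicas=5"]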
Use Cases for K8s Horizontal Pod Autoscaling (HPA)
E-commerce Traffic Surges: HPA scales pods during flash sales, holiday events, or promotional campaigns to handle high traffic volumes.
Streaming Platforms: HPA automatically scales video encoding/streaming services based on the number of active users.
Financial Applications: It scales trading systems during market opening/closing hours when transaction rates spike.
IoT Data Ingestion: HPA scales up processing services when devices send large amounts of telemetry data.
Gaming Servers: It handles fluctuating player counts by scaling game server pods dynamically.
Web and API Services: HPA scales backend services based on CPU, memory usage, or request rates to maintain response times.
Machine Learning Workloads: HPA scales model training or inference services dynamically based on queue sizes or resource utilization.
Log Aggregation Systems: HPA adjusts pod counts for tools like ELK/EFK stacks based on log ingestion rates.
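Most of these use cases boil down to an HPA object like the following minimal sketch (autoscaling/v2 API; the name web-api, the replica range, and the 50% CPU target are assumed for illustration):
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-api
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50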
Hands-on Samples
I've implemented two hands-on samples:
- Hands-on Sample #1 for Manual Scaling
- Hands-on Sample #2 for Auto Scaling with HPA
Hands-on Sample #1 for Manual Scaling
This scenario shows:
- how to create a deployment,
- how to scale a deployment up/down manually with kubectl scale deployment,
- how to connect to one of the pods with bash,
- how to show the Ethernet interfaces of a pod and ping other pods.
Steps
- Run minikube:
omer@k8s:$ minikube start
😄 minikube v1.35.0 on Ubuntu 20.04
✨ Automatically selected the docker driver
📌 Using Docker driver with root privileges
👍 Starting "minikube" primary control-plane node in "minikube" cluster
🚜 Pulling base image v0.0.46 ...
🔥 Creating docker container (CPUs=2, Memory=3100MB) ...
❗ Failing to connect to https://registry.k8s.io/ from both inside the minikube container and host machine
💡 To pull new external images, you may need to configure a proxy: https://minikube.sigs.k8s.io/docs/reference/networking/proxy/
🐳 Preparing Kubernetes v1.32.0 on Docker 27.4.1 ...
▪ Generating certificates and keys ...
▪ Booting up control plane ...
▪ Configuring RBAC rules ...
🔗 Configuring bridge CNI (Container Networking Interface) ...
🔎 Verifying Kubernetes components...
▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🌟 Enabled addons: storage-provisioner, default-storageclass
🏄 Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default
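- (Optional sanity check) Verify the cluster is up before creating the deployment; the single minikube node should report STATUS Ready:
omer@k8s:$ kubectl get nodes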
- Create a YAML file (deployment1.yaml) in your directory and copy the definition below into the file.
- File: https://github.com/omerbsezer/Fast-Kubernetes/blob/main/labs/deployment/deployment1.yaml
YAML File Explanation:
- selector: => deployment selector
- matchLabels: => the deployment selects pods labeled "app: frontend" and monitors and traces them
- app: frontend => if one of these pods is killed, K8s looks at the desired state (replicas: 3) and recreates a pod to keep the number of replicas
- labels: => pod labels; if the deployment selector matches these labels, the deployment follows the pods that carry them
- app: frontend => key: value
- image: nginx:latest => image downloaded from DockerHub
- containerPort: 80 => port opened on the container
apiVersion: apps/v1
kind: Deployment
metadata:
  name: firstdeployment
  labels:
    team: development
spec:
  replicas: 3
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
- Create deployment and list the deployment's pods:
omer@k8s:$ kubectl apply -f deployment1.yaml
deployment.apps/firstdeployment created
omer@k8s:$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
firstdeployment-54758c4c55-dh98p 0/1 ContainerCreating 0 9s <none> minikube <none> <none>
firstdeployment-54758c4c55-pnz5c 0/1 ContainerCreating 0 9s <none> minikube <none> <none>
firstdeployment-54758c4c55-zbn7t 0/1 ContainerCreating 0 9s <none> minikube <none> <none>
omer@k8s:$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
firstdeployment-54758c4c55-dh98p 1/1 Running 0 19s 10.244.0.5 minikube <none> <none>
firstdeployment-54758c4c55-pnz5c 1/1 Running 0 19s 10.244.0.4 minikube <none> <none>
firstdeployment-54758c4c55-zbn7t 1/1 Running 0 19s 10.244.0.3 minikube <none> <none>
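- (Optional) Because the deployment selects its pods by the app: frontend label, you can list exactly those pods with a label selector:
omer@k8s:$ kubectl get pods -l app=frontend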
- Delete one of the pods (e.g. the one ending in -dh98p); K8s then automatically creates a new pod (ending in -jp69b) to restore the desired replica count:
omer@k8s:$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
firstdeployment-54758c4c55-dh98p 1/1 Running 0 19s 10.244.0.5 minikube <none> <none>
firstdeployment-54758c4c55-pnz5c 1/1 Running 0 19s 10.244.0.4 minikube <none> <none>
firstdeployment-54758c4c55-zbn7t 1/1 Running 0 19s 10.244.0.3 minikube <none> <none>
omer@k8s:$ kubectl delete pod firstdeployment-54758c4c55-dh98p
pod "firstdeployment-54758c4c55-dh98p" deleted
omer@k8s:$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
firstdeployment-54758c4c55-jp69b 1/1 Running 0 3s 10.244.0.6 minikube <none> <none>
firstdeployment-54758c4c55-pnz5c 1/1 Running 0 88s 10.244.0.4 minikube <none> <none>
firstdeployment-54758c4c55-zbn7t 1/1 Running 0 88s 10.244.0.3 minikube <none> <none>
- Scale up to 7 replicas:
omer@k8s:$ kubectl scale deployments firstdeployment --replicas=7
deployment.apps/firstdeployment scaled
omer@k8s:$ kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
firstdeployment 7/7 7 7 5m35s
omer@k8s:$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
firstdeployment-54758c4c55-8q2pz 1/1 Running 0 16s 10.244.0.9 minikube <none> <none>
firstdeployment-54758c4c55-d4lqh 1/1 Running 0 16s 10.244.0.8 minikube <none> <none>
firstdeployment-54758c4c55-jp69b 1/1 Running 0 4m13s 10.244.0.6 minikube <none> <none>
firstdeployment-54758c4c55-pnz5c 1/1 Running 0 5m38s 10.244.0.4 minikube <none> <none>
firstdeployment-54758c4c55-sbjbx 1/1 Running 0 16s 10.244.0.7 minikube <none> <none>
firstdeployment-54758c4c55-wxcvx 1/1 Running 0 16s 10.244.0.10 minikube <none> <none>
firstdeployment-54758c4c55-zbn7t 1/1 Running 0 5m38s 10.244.0.3 minikube <none> <none>
- Scale down to 3 replicas:
omer@k8s:$ kubectl scale deployments firstdeployment --replicas=3
deployment.apps/firstdeployment scaled
omer@k8s:$ kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
firstdeployment 3/3 3 3 8m27s
omer@k8s:$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
firstdeployment-54758c4c55-jp69b 1/1 Running 0 7m6s 10.244.0.6 minikube <none> <none>
firstdeployment-54758c4c55-pnz5c 1/1 Running 0 8m31s 10.244.0.4 minikube <none> <none>
firstdeployment-54758c4c55-zbn7t 1/1 Running 0 8m31s 10.244.0.3 minikube <none> <none>
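- As an alternative to kubectl scale, you can also scale declaratively by editing replicas: in deployment1.yaml and re-applying the file:
## change replicas: 3 to the desired count in deployment1.yaml, then:
omer@k8s:$ kubectl apply -f deployment1.yaml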
- Connect to one of the pods with bash:
omer@k8s:$ kubectl exec -it firstdeployment-54758c4c55-jp69b -- bash
root@firstdeployment-54758c4c55-jp69b:/# ping 10.244.0.4
bash: ping: command not found
root@firstdeployment-54758c4c55-jp69b:/# ifconfig
bash: ifconfig: command not found
- To install ifconfig, run: "apt update", "apt install net-tools".
To install ping, run: "apt install iputils-ping".
Show the Ethernet interfaces and ping another pod to demonstrate pod-to-pod connectivity:
omer@k8s:$ kubectl exec -it firstdeployment-54758c4c55-jp69b -- bash
root@firstdeployment-54758c4c55-jp69b:/# apt update
root@firstdeployment-54758c4c55-jp69b:/# apt install iputils-ping
root@firstdeployment-54758c4c55-jp69b:/# apt install net-tools
root@firstdeployment-54758c4c55-jp69b:/# ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.244.0.6 netmask 255.255.0.0 broadcast 10.244.255.255
inet6 fe80::787b:28ff:fe4c:1782 prefixlen 64 scopeid 0x20<link>
ether 7a:7b:28:4c:17:82 txqueuelen 0 (Ethernet)
RX packets 2744 bytes 9833003 (9.3 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 1335 bytes 101396 (99.0 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
root@firstdeployment-54758c4c55-jp69b:/# ping 10.244.0.4
PING 10.244.0.4 (10.244.0.4) 56(84) bytes of data.
64 bytes from 10.244.0.4: icmp_seq=2 ttl=64 time=0.110 ms
^C
--- 10.244.0.4 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1015ms
rtt min/avg/max/mdev = 0.110/0.114/0.119/0.004 ms
root@firstdeployment-54758c4c55-jp69b:/# ping 10.244.0.3
PING 10.244.0.3 (10.244.0.3) 56(84) bytes of data.
64 bytes from 10.244.0.3: icmp_seq=1 ttl=64 time=0.092 ms
^C
--- 10.244.0.3 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1079ms
rtt min/avg/max/mdev = 0.049/0.070/0.092/0.021 ms
- Delete deployment:
omer@k8s:$ kubectl delete -f deployment1.yaml
deployment.apps "firstdeployment" deleted
omer@k8s:$ kubectl get pods -o wide
No resources found in default namespace.
Hands-on Sample #2 for Auto Scaling with HPA
This scenario shows:
- how to enable the metrics-server addon (required by HPA),
- how to view HPA,
- how to trigger scaling up/down with HPA.
Steps
- Enable the metrics-server addon so HPA can read resource metrics:
omer@k8s:$ minikube addons enable metrics-server
💡 metrics-server is an addon maintained by Kubernetes. For any concerns contact minikube on GitHub.
You can view the list of minikube maintainers at: https://github.com/kubernetes/minikube/blob/master/OWNERS
▪ Using image registry.k8s.io/metrics-server/metrics-server:v0.7.2
🌟 The 'metrics-server' addon is enabled
omer@k8s:$ minikube addons list
|-----------------------------|----------|--------------|--------------------------------|
| ADDON NAME | PROFILE | STATUS | MAINTAINER |
|-----------------------------|----------|--------------|--------------------------------|
| ambassador | minikube | disabled | 3rd party (Ambassador) |
| amd-gpu-device-plugin | minikube | disabled | 3rd party (AMD) |
| auto-pause | minikube | disabled | minikube |
| cloud-spanner | minikube | disabled | Google |
| csi-hostpath-driver | minikube | disabled | Kubernetes |
| dashboard | minikube | disabled | Kubernetes |
| default-storageclass | minikube | enabled ✅ | Kubernetes |
| efk | minikube | disabled | 3rd party (Elastic) |
| freshpod | minikube | disabled | Google |
| gcp-auth | minikube | disabled | Google |
| gvisor | minikube | disabled | minikube |
| headlamp | minikube | disabled | 3rd party (kinvolk.io) |
| inaccel | minikube | disabled | 3rd party (InAccel |
| | | | [info@inaccel.com]) |
| ingress | minikube | disabled | Kubernetes |
| ingress-dns | minikube | disabled | minikube |
| inspektor-gadget | minikube | disabled | 3rd party |
| | | | (inspektor-gadget.io) |
| istio | minikube | disabled | 3rd party (Istio) |
| istio-provisioner | minikube | disabled | 3rd party (Istio) |
| kong | minikube | disabled | 3rd party (Kong HQ) |
| kubeflow | minikube | disabled | 3rd party |
| kubevirt | minikube | disabled | 3rd party (KubeVirt) |
| logviewer | minikube | disabled | 3rd party (unknown) |
| metallb | minikube | disabled | 3rd party (MetalLB) |
| metrics-server | minikube | enabled ✅ | Kubernetes |
| nvidia-device-plugin | minikube | disabled | 3rd party (NVIDIA) |
| nvidia-driver-installer | minikube | disabled | 3rd party (NVIDIA) |
| nvidia-gpu-device-plugin | minikube | disabled | 3rd party (NVIDIA) |
| olm | minikube | disabled | 3rd party (Operator Framework) |
| pod-security-policy | minikube | disabled | 3rd party (unknown) |
| portainer | minikube | disabled | 3rd party (Portainer.io) |
| registry | minikube | disabled | minikube |
| registry-aliases | minikube | disabled | 3rd party (unknown) |
| registry-creds | minikube | disabled | 3rd party (UPMC Enterprises) |
| storage-provisioner | minikube | enabled ✅ | minikube |
| storage-provisioner-gluster | minikube | disabled | 3rd party (Gluster) |
| storage-provisioner-rancher | minikube | disabled | 3rd party (Rancher) |
| volcano | minikube | disabled | third-party (volcano) |
| volumesnapshots | minikube | disabled | Kubernetes |
| yakd | minikube | disabled | 3rd party (marcnuri.com) |
|-----------------------------|----------|--------------|--------------------------------|
💡 To see addons list for other profiles use: `minikube addons -p name list`
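- Optionally, confirm that metrics are flowing before relying on HPA (it can take a minute or two after enabling the addon):
omer@k8s:$ kubectl top nodes
omer@k8s:$ kubectl top pods -A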
- Create nginx-deployment.yaml with a Service and a Deployment. The CPU request matters here: HPA computes CPU utilization as a percentage of the pod's requested CPU:
apiVersion: v1
kind: Service
metadata:
  name: nginx-deployment
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        resources:
          requests:
            cpu: "30m"
          limits:
            cpu: "80m"
- Run nginx deployment and service:
omer@k8s:$ kubectl apply -f nginx-deployment.yaml
service/nginx-deployment created
deployment.apps/nginx-deployment created
omer@k8s:$ kubectl get pods -o wide -A
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
default nginx-deployment-d99898c47-2kdq5 0/1 Running 0 2s <none> minikube <none> <none>
kube-system coredns-668d6bf9bc-7whqt 1/1 Running 0 23m 10.244.0.2 minikube <none> <none>
kube-system etcd-minikube 1/1 Running 0 23m 192.168.49.2 minikube <none> <none>
kube-system kube-apiserver-minikube 1/1 Running 0 23m 192.168.49.2 minikube <none> <none>
kube-system kube-controller-manager-minikube 1/1 Running 0 23m 192.168.49.2 minikube <none> <none>
kube-system kube-proxy-z5ncc 1/1 Running 0 23m 192.168.49.2 minikube <none> <none>
kube-system kube-scheduler-minikube 1/1 Running 0 23m 192.168.49.2 minikube <none> <none>
kube-system metrics-server-7496f689c7-jwdf4 1/1 Running 0 3m2s 10.244.0.11 minikube <none> <none>
kube-system storage-provisioner 1/1 Running 1 (23m ago) 23m 192.168.49.2 minikube <none> <none>
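- Create the HPA if you have not already; one way that matches the 50% CPU target and 1-5 replica range shown below is kubectl autoscale:
omer@k8s:$ kubectl autoscale deployment nginx-deployment --cpu-percent=50 --min=1 --max=5
horizontalpodautoscaler.autoscaling/nginx-deployment autoscaled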
- Get HPA status:
omer@k8s:$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
nginx-deployment Deployment/nginx-deployment cpu: <unknown>/50% 1 5 1 2m
- In another terminal, generate load on the deployment to trigger the HPA:
omer@k8s:$ kubectl run -i --tty load-generator --image=busybox -- /bin/sh -c "while true; do wget -q -O- http://nginx-deployment; done"
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
- After 5-10 minutes, your deployment will be autoscaled (you can watch the progress with kubectl get hpa -w):
omer@k8s:$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
nginx-deployment Deployment/nginx-deployment cpu: 68%/50% 1 5 5 28m
- NOTE: If there is no change after about 10 minutes, check the metrics-server config:
omer@k8s:$ kubectl edit deployment metrics-server -n kube-system
## if not present, add these arguments to the container's args section:
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP
- The HPA automatically increased the nginx deployment to 5 pods:
omer@k8s:$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
load-generator 1/1 Running 1 (6m51s ago) 15m 10.244.0.13 minikube <none> <none>
nginx-deployment-d99898c47-2kdq5 1/1 Running 0 16m 10.244.0.12 minikube <none> <none>
nginx-deployment-d99898c47-5f726 1/1 Running 0 13m 10.244.0.16 minikube <none> <none>
nginx-deployment-d99898c47-b7qtn 1/1 Running 0 13m 10.244.0.17 minikube <none> <none>
nginx-deployment-d99898c47-f9bxf 1/1 Running 0 13m 10.244.0.15 minikube <none> <none>
nginx-deployment-d99898c47-s6xvk 1/1 Running 0 13m 10.244.0.14 minikube <none> <none>
- Delete the load generator to remove the load:
omer@k8s:$ kubectl delete pod load-generator
pod "load-generator" deleted
- Check the HPA: the CPU load gradually decreases and the replica count drops from 5 back to 1:
omer@k8s:$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
nginx-deployment Deployment/nginx-deployment cpu: 70%/50% 1 5 5 34m
omer@k8s:$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
nginx-deployment Deployment/nginx-deployment cpu: 33%/50% 1 5 5 35m
omer@k8s:$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
nginx-deployment Deployment/nginx-deployment cpu: 0%/50% 1 5 5 37m
omer@k8s:$ kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
nginx-deployment Deployment/nginx-deployment cpu: 0%/50% 1 5 1 41m
- NOTE: If the deployment does not scale down automatically, check the HPA's scaling behavior policy; if it is not present, add it:
omer@k8s:$ kubectl edit hpa nginx-deployment
## copy and add to HorizontalPodAutoscaler
spec:
  behavior:
    scaleDown:
      policies:
      - periodSeconds: 15
        type: Percent
        value: 100
      - periodSeconds: 15
        type: Pods
        value: 4
      selectPolicy: Max
      stabilizationWindowSeconds: 0
    scaleUp:
      policies:
      - periodSeconds: 15
        type: Percent
        value: 100
      - periodSeconds: 15
        type: Pods
        value: 4
      selectPolicy: Max
      stabilizationWindowSeconds: 0
  maxReplicas: 5
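- Optionally, confirm the behavior section was applied:
omer@k8s:$ kubectl describe hpa nginx-deployment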
- Finally, only 1 pod is running:
omer@k8s:$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-deployment-d99898c47-f9bxf 1/1 Running 0 24m 10.244.0.15 minikube <none> <none>
- Delete the deployment and the minikube cluster:
omer@k8s:$ kubectl delete -f nginx-deployment.yaml
service "nginx-deployment" deleted
deployment.apps "nginx-deployment" deleted
omer@k8s:$ kubectl get pods -o wide
No resources found in default namespace.
omer@k8s:$ minikube delete
🔥 Deleting "minikube" in docker ...
🔥 Deleting container "minikube" ...
🔥 Removing /home/omer/.minikube/machines/minikube ...
💀 Removed all traces of the "minikube" cluster.
Conclusion
This post focused on manual scaling and horizontal autoscaling under load. With the sample scenarios, we tested manual scale up/down and automatic scaling with HPA.
If you're interested in exploring other K8s components, please have a look:
K8s Tutorial - Part 1: Learn and Master Kubernetes, Kubectl, Pods, Deployments, Network, Service
If you found the tutorial interesting, I’d love to hear your thoughts in the blog post comments. Feel free to share your reactions or leave a comment. I truly value your input and engagement 😉
For other posts 👉 https://dev.to/omerberatsezer 🧐
Multi-Container Sidecar Pattern on Kubernetes with Hands-on Sample
Follow for Tips, Tutorials, Hands-On Labs for AWS, K8s, Docker, Linux, DevOps, Ansible, Machine Learning, Generative AI.