Keeping a close eye on your Kubernetes cluster is essential to detect issues early. In this section, I'll explain how I monitor my cluster.
A ridiculously over-engineered setup for a homelab
At this point, I know I'm over-engineering the cluster; I'm the only one using it, and my portfolio hosted on it gets minimal traffic, 5 visits per month at most (measured by Google Search Console). So, if it goes down, no one will be impacted.
The reason I am doing all this is to learn things. I LEARNED A LOT (and invested a lot of time too)…
I chose to deploy Prometheus with Grafana using the Prometheus Operator for Kubernetes, which greatly simplifies the deployment process. This stack is widely used in the professional world, so I wanted to explore it.
- Prometheus gathers data from nodes and pods.
- Then, Grafana displays the data.
- I could also set up AlertManager to configure alerts via email, webhooks, SMS or whatever, but I haven't gone that far yet.
Prometheus operator
The official documentation offers three ways to install the Prometheus Operator. I use a GitOps approach with ArgoCD to deploy everything to the cluster and chose the Helm chart for installation.
Chart.yaml
apiVersion: v2
name: prometheus-subchart
type: application
version: 60.3.0
appVersion: "60.3.0"
dependencies:
  - name: kube-prometheus-stack
    version: 60.3.0
    repository: https://prometheus-community.github.io/helm-charts
values.yaml
kube-prometheus-stack:
  namespaceOverride: prometheus-stack
  defaultRules:
    rules:
      alertmanager: false
      etcd: false # microk8s does not use etcd if HA is enabled
      windows: false
  alertmanager:
    enabled: false
  prometheus:
    prometheusSpec:
      retention: 30d
  grafana:
    ingress:
      enabled: true
      ingressClassName: nginx
      annotations:
        cert-manager.io/cluster-issuer: letsencrypt-prod
      hosts:
        - grafana.mydomain
      tls:
        - secretName: certificate-prod-grafana
          hosts:
            - grafana.mydomain
    sidecar:
      datasources:
        alertmanager:
          enabled: false
  kubeEtcd:
    enabled: false # microk8s does not use etcd if HA is enabled
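For reference, here is a minimal sketch of what an ArgoCD Application pointing at this chart could look like. The repository URL, the path, the Application name and the argocd namespace are placeholders I made up for illustration, not my actual setup. The ServerSideApply option is there because the CRDs shipped with kube-prometheus-stack are large and commonly fail to sync with a client-side apply; treat it as a suggestion to verify on your cluster.

```yaml
# Hypothetical ArgoCD Application for the wrapper chart described above.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: prometheus-stack            # assumed name
  namespace: argocd                 # assumes ArgoCD runs in the "argocd" namespace
spec:
  project: default
  source:
    repoURL: https://github.com/example/homelab-gitops.git  # placeholder repository
    targetRevision: HEAD
    path: apps/prometheus-subchart  # placeholder: folder holding Chart.yaml and values.yaml
  destination:
    server: https://kubernetes.default.svc
    namespace: prometheus-stack     # matches namespaceOverride in values.yaml
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
      - ServerSideApply=true        # avoids oversized-annotation errors on the large CRDs
```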
And voila!
Grafana is accessible through grafana.mydomain with the default credentials admin:prom-operator (change it!).
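One way to change the default credentials without committing a password to Git is to let the Grafana subchart read them from an existing Secret. This is only a sketch: the Secret name grafana-admin-credentials and its keys are assumptions, and the Secret itself has to be created separately (by hand or with whatever secret management you use).

```yaml
# Sketch: addition to values.yaml, merged under the existing grafana section.
kube-prometheus-stack:
  grafana:
    admin:
      existingSecret: grafana-admin-credentials  # hypothetical Secret, created outside the chart
      userKey: admin-user                        # key in the Secret holding the username
      passwordKey: admin-password                # key in the Secret holding the password
```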
Multiple dashboards are also configured by default with the operator.
You can now configure metric scraping for your pods and services using the PodMonitor and ServiceMonitor CRDs, thanks to the operator.
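As an illustration, a ServiceMonitor for an application could look roughly like the sketch below. The my-app names, the metrics port name and the release label value are all assumptions. By default, the Prometheus instance created by kube-prometheus-stack only selects monitors carrying a release label that matches the Helm release name, unless prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues is set to false, so adjust the label to your own release.

```yaml
# Hypothetical ServiceMonitor scraping a Service labelled app.kubernetes.io/name=my-app.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app
  namespace: my-app                    # same namespace as the Service being scraped
  labels:
    release: prometheus-stack          # assumption: must match your Helm release name
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: my-app   # labels of the target Service
  endpoints:
    - port: metrics                    # named port on the Service (assumed)
      path: /metrics
      interval: 30s
```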