In this post we'll implement a simple kubernetes controller using kubebuilder.
Installing kubebuilder
The installation instructions can be found in the kubebuilder installation docs.
Install kustomize
Since kubebuilder relies on kustomize internally, we need to install that as well, following the kustomize installation instructions.
Now we're all set up to start building our controller.
Start the project
I have initialized an empty Go project:
go mod init cnat
Next we'll initialize the controller project
kubebuilder init --domain ishankhare.dev
Now we'll ask kubebuilder to set up all the necessary scaffolding for our project:
kubebuilder create api --group cnat --version v1alpha1 --kind At
Kubebuilder will ask us whether to create the Resource and Controller directories. We answer y to each, since we want both to be created:
Create Resource [y/n]
y
Create Controller [y/n]
y
Writing scaffold for you to edit...
Test install scaffold
If we now run make install, kubebuilder should generate the base CRDs under config/crd/bases and a few other files for us.
Running make run should now allow us to launch the operator locally. This is really helpful for local testing while we implement and debug the business logic of our operator. The live logs give us insight into how our code changes affect what the operator does.
Enable status subresource
In the file api/v1alpha1/at_types.go, add the following comment above the At struct:
// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
// At is the Schema for the ats API
type At struct {
.
.
.
This will enable us to use privilege separation between the spec and status fields.
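To make the idea of privilege separation a bit more concrete, here is a minimal model of how writes behave once the status subresource is enabled: a write to the main resource leaves the stored status alone, and a write to the status endpoint leaves the stored spec alone. The At type and the updateMain and updateStatus helpers below are simplified stand-ins invented for this illustration, not kubebuilder or API server code.

```go
package main

import "fmt"

// At is a simplified stand-in for a custom resource with a spec and a status.
// (Hypothetical type for illustration only.)
type At struct {
	Spec   string
	Status string
}

// updateMain models a write to the main resource endpoint:
// the spec comes from the incoming object, the stored status is preserved.
func updateMain(stored, incoming At) At {
	return At{Spec: incoming.Spec, Status: stored.Status}
}

// updateStatus models a write to the status endpoint:
// the status comes from the incoming object, the stored spec is preserved.
func updateStatus(stored, incoming At) At {
	return At{Spec: stored.Spec, Status: incoming.Status}
}

func main() {
	stored := At{Spec: "echo hello", Status: "PENDING"}

	// A user edits the spec and also tries to change the status.
	stored = updateMain(stored, At{Spec: "echo bye", Status: "DONE"})
	fmt.Printf("%+v\n", stored)

	// The controller reports progress and (by mistake) carries a changed spec.
	stored = updateStatus(stored, At{Spec: "echo changed", Status: "RUNNING"})
	fmt.Printf("%+v\n", stored)
}
```

In this model the user can never overwrite what the controller reports, and the controller's status updates can never change what the user asked for.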
Background
A little background on what we're going to build here. Our controller is 'Cloud Native At' (cnat for short), a cloud native version of the Unix at command. We provide a manifest for our custom resource containing 2 main things:
- A command to execute
- Time to execute the command
Our controller will wait and watch until the appropriate time, then spawn a Pod and run the given command in that Pod. All the while it will also keep updating the Status of that resource, which we can view with kubectl.
So let's begin.
CRD outline
This section describes what we want our custom resource to look like. Given the two points we listed above, we can represent them as:
apiVersion: cnat.ishankhare.dev/v1alpha1
kind: At
metadata:
  name: at-sample
spec:
  schedule: "2020-11-16T10:12:00Z"
  command: "echo hello world!"
This is enough information for our controller to decide 'when' and 'what' to execute.
Defining the CRD
Now that we know what we expect out of the CRD, let's implement that. Go to the file api/v1alpha1/at_types.go and add the following constants to the code:
const (
	PhasePending = "PENDING"
	PhaseRunning = "RUNNING"
	PhaseDone    = "DONE"
)
We also edit the existing AtSpec and AtStatus structs as follows:
type AtSpec struct {
	Schedule string `json:"schedule,omitempty"`
	Command  string `json:"command,omitempty"`
}

type AtStatus struct {
	Phase string `json:"phase,omitempty"`
}
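The json struct tags are what tie these Go fields to the lowercase keys in the manifest. As a quick sanity check, here is a small standalone sketch using only the standard library: it copies AtSpec from above and adds an encodeSpec helper (a name made up for this example) to show the resulting JSON and the effect of omitempty.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AtSpec is copied from the post so the example is self-contained.
type AtSpec struct {
	Schedule string `json:"schedule,omitempty"`
	Command  string `json:"command,omitempty"`
}

// encodeSpec is a hypothetical helper that returns the JSON form of a spec.
func encodeSpec(s AtSpec) (string, error) {
	b, err := json.Marshal(s)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	full, _ := encodeSpec(AtSpec{
		Schedule: "2020-11-16T10:12:00Z",
		Command:  "echo hello world!",
	})
	fmt.Println(full) // {"schedule":"2020-11-16T10:12:00Z","command":"echo hello world!"}

	// With omitempty, unset fields are dropped from the output entirely.
	empty, _ := encodeSpec(AtSpec{})
	fmt.Println(empty) // {}
}
```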
We save this and run:
$ make manifests
go: creating new go.mod: module tmp
go: found sigs.k8s.io/controller-tools/cmd/controller-gen in sigs.k8s.io/controller-tools v0.2.5
/Users/ishankhare/godev/bin/controller-gen "crd:trivialVersions=true" rbac:roleName=manager-role webhook paths="./..." output:crd:artifacts:config=config/crd/bases
$ make install # this will install the generated CRD into k8s
We should now be able to see an exhaustive CustomResourceDefinition generated for us at config/crd/bases/cnat.ishankhare.dev_ats.yaml. For people new to CRDs: they help kubernetes identify what our kind: At means. Unlike built-in resource types such as Deployments and Pods, custom types have to be defined for kubernetes through a CustomResourceDefinition. Let's try to create an At resource:
cat >sample-at.yaml<<EOF
apiVersion: cnat.ishankhare.dev/v1alpha1
kind: At
metadata:
  name: at-sample
spec:
  schedule: "2020-11-16T10:12:00Z"
  command: "echo hello world!"
EOF
If we now kubectl apply this manifest and list the At resources:
kubectl apply -f sample-at.yaml
kubectl get at
NAME AGE
at-sample 124m
So now kubernetes recognizes our custom type, but it still does NOT know what to do with it. The logic part of the controller is still missing, but we have the skeleton ready to start writing our logic around it.
Implementing the Controller logic
Now we're coming to the fun part: implementing the logic for our operator. This will tell kubernetes what to do with resources of our custom type. As some say, it's like "making kubernetes do tricks!".
Most of the scaffolding has already been generated for us by kubebuilder in main.go and controllers/. Inside the controllers/at_controller.go file we have the function Reconcile. Consider this the body of the reconcile loop.
A small refresher: a controller runs a loop over the said objects and watches for any changes to them. When changes happen, the loop body is triggered and the relevant logic is executed. The Reconcile function is that logical piece, the entry point for all our controller logic.
Let's start by adding a basic structure around it:
func (r *AtReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
	reqLogger := r.Log.WithValues("at", req.NamespacedName)
	reqLogger.Info("=== Reconciling At")

	instance := &cnatv1alpha1.At{}
	err := r.Get(context.TODO(), req.NamespacedName, instance)
	if err != nil {
		if errors.IsNotFound(err) {
			// object not found, could have been deleted after
			// reconcile request, hence don't requeue
			return ctrl.Result{}, nil
		}
		// error reading the object, requeue the request
		return ctrl.Result{}, err
	}

	// if no phase set, default to Pending
	if instance.Status.Phase == "" {
		instance.Status.Phase = cnatv1alpha1.PhasePending
	}

	// state transition PENDING -> RUNNING -> DONE
	return ctrl.Result{}, nil
}
Here we create a logger and an empty instance of the At object, which will allow us to query kubernetes for existing At objects and read/write their status etc. The function is supposed to return the result of the reconciliation logic and an error. The ctrl.Result object can optionally tell the reconciler to either Requeue immediately (bool) or RequeueAfter a given time.Duration; both are optional.
Now let's come to the important part, the state diagram. The image below explains what happens when we create a new At:
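The same PENDING -> RUNNING -> DONE transitions can also be written down as a small pure function, which is handy for reasoning about the flow before any kubernetes calls get involved. This is only a sketch: nextPhase and its two boolean inputs (due, meaning the scheduled time has been reached, and podFinished, meaning the Pod has succeeded or failed) are names invented for this example.

```go
package main

import "fmt"

const (
	PhasePending = "PENDING"
	PhaseRunning = "RUNNING"
	PhaseDone    = "DONE"
)

// nextPhase is a hypothetical helper that returns the phase an At should move
// to, given its current phase and two observations about the outside world.
func nextPhase(phase string, due, podFinished bool) string {
	switch phase {
	case "", PhasePending:
		// An unset phase is treated as PENDING.
		if due {
			return PhaseRunning
		}
		return PhasePending
	case PhaseRunning:
		if podFinished {
			return PhaseDone
		}
		return PhaseRunning
	default:
		// DONE (and anything unknown) stays where it is.
		return phase
	}
}

func main() {
	phase := ""
	phase = nextPhase(phase, false, false)
	fmt.Println(phase) // PENDING
	phase = nextPhase(phase, true, false)
	fmt.Println(phase) // RUNNING
	phase = nextPhase(phase, true, true)
	fmt.Println(phase) // DONE
}
```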
Now, let's start implementing this logic in the function:
	...
	// state transition PENDING -> RUNNING -> DONE
	switch instance.Status.Phase {
	case cnatv1alpha1.PhasePending:
		reqLogger.Info("Phase: PENDING")
		diff, err := schedule.TimeUntilSchedule(instance.Spec.Schedule)
		if err != nil {
			reqLogger.Error(err, "Schedule parsing failure")
			return ctrl.Result{}, err
		}
		reqLogger.Info("Schedule parsing done", "Result", fmt.Sprintf("%v", diff))
		if diff > 0 {
			// not yet time to execute, wait until scheduled time;
			// diff is already a time.Duration, so it is used as-is
			return ctrl.Result{RequeueAfter: diff}, nil
		}
		reqLogger.Info("It's time!", "Ready to execute", instance.Spec.Command)
		// change state
		instance.Status.Phase = cnatv1alpha1.PhaseRunning
The schedule.TimeUntilSchedule function is simple to implement, like so:
package schedule

import "time"

func TimeUntilSchedule(schedule string) (time.Duration, error) {
	now := time.Now().UTC()
	layout := "2006-01-02T15:04:05Z"
	scheduledTime, err := time.Parse(layout, schedule)
	if err != nil {
		return time.Duration(0), err
	}
	return scheduledTime.Sub(now), nil
}
The next case in our switch body is for the RUNNING phase. This is the most complicated one and I've added comments wherever I can to explain it better.
	case cnatv1alpha1.PhaseRunning:
		reqLogger.Info("Phase: RUNNING")
		pod := spawn.NewPodForCR(instance)
		err := ctrl.SetControllerReference(instance, pod, r.Scheme)
		if err != nil {
			// requeue with error
			return ctrl.Result{}, err
		}
		query := &corev1.Pod{}
		// try to see if the pod already exists
		err = r.Get(context.TODO(), req.NamespacedName, query)
		if err != nil && errors.IsNotFound(err) {
			// does not exist, create a pod
			err = r.Create(context.TODO(), pod)
			if err != nil {
				return ctrl.Result{}, err
			}
			// Successfully created a Pod
			reqLogger.Info("Pod Created successfully", "name", pod.Name)
			return ctrl.Result{}, nil
		} else if err != nil {
			// looking up the pod failed, requeue with err
			reqLogger.Error(err, "cannot get pod")
			return ctrl.Result{}, err
		} else if query.Status.Phase == corev1.PodFailed ||
			query.Status.Phase == corev1.PodSucceeded {
			// pod already finished or errored out
			reqLogger.Info("Container terminated", "reason", query.Status.Reason,
				"message", query.Status.Message)
			instance.Status.Phase = cnatv1alpha1.PhaseDone
		} else {
			// don't requeue, it will happen automatically when
			// pod status changes
			return ctrl.Result{}, nil
		}
The call to ctrl.SetControllerReference tells the kubernetes runtime that the created Pod is "owned" by this At instance. This will come in handy later when watching resources created by our controller.
We also use another external function here, spawn.NewPodForCR, which is responsible for building the new Pod from our At custom resource spec. It is implemented as follows:
package spawn

import (
	"strings"

	cnatv1alpha1 "cnat/api/v1alpha1"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func NewPodForCR(cr *cnatv1alpha1.At) *corev1.Pod {
	labels := map[string]string{
		"app": cr.Name,
	}
	return &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{
			Name:      cr.Name,
			Namespace: cr.Namespace,
			Labels:    labels,
		},
		Spec: corev1.PodSpec{
			Containers: []corev1.Container{
				{
					Name:    "busybox",
					Image:   "busybox",
					Command: strings.Split(cr.Spec.Command, " "),
				},
			},
			RestartPolicy: corev1.RestartPolicyOnFailure,
		},
	}
}
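One detail worth knowing about the Command field: strings.Split on a single space is a very simple tokenizer. It works for our sample command, but repeated spaces produce empty arguments and shell-style quoting is not understood. The sketch below shows the difference with strings.Fields from the standard library; splitCommand and fieldsCommand are names made up for this example, and swapping one for the other is a possible tweak rather than something the controller above does.

```go
package main

import (
	"fmt"
	"strings"
)

// splitCommand mirrors what NewPodForCR does with the command string.
func splitCommand(cmd string) []string {
	return strings.Split(cmd, " ")
}

// fieldsCommand is a possible alternative that collapses runs of whitespace.
func fieldsCommand(cmd string) []string {
	return strings.Fields(cmd)
}

func main() {
	fmt.Printf("%q\n", splitCommand("echo hello world!")) // ["echo" "hello" "world!"]
	fmt.Printf("%q\n", splitCommand("echo  hello"))       // ["echo" "" "hello"]
	fmt.Printf("%q\n", fieldsCommand("echo  hello"))      // ["echo" "hello"]

	// Neither helper understands quotes: the quoted part is cut in two.
	fmt.Printf("%q\n", fieldsCommand(`sh -c "echo hi"`))
}
```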
The last bits of code deal with the DONE phase and the default case of the switch. We also update our status subresource outside the switch, as a common part of the code:
	case cnatv1alpha1.PhaseDone:
		reqLogger.Info("Phase: DONE")
		// reconcile without requeuing
		return ctrl.Result{}, nil
	default:
		reqLogger.Info("NOP")
		return ctrl.Result{}, nil
	}

	// update status
	err = r.Status().Update(context.TODO(), instance)
	if err != nil {
		return ctrl.Result{}, err
	}
	return ctrl.Result{}, nil
Lastly, we make use of the controller reference we set earlier by adding the following to the SetupWithManager function in the same file:
	err := ctrl.NewControllerManagedBy(mgr).
		For(&cnatv1alpha1.At{}).
		Owns(&corev1.Pod{}).
		Complete(r)
	if err != nil {
		return err
	}
	return nil
The part specifying Owns(&corev1.Pod{}) is important: it tells the controller manager that pods created by this controller also need to be watched for changes.
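Conceptually, watching owned pods means that an event on a Pod is translated into a reconcile request for the At that controls it, using the owner reference we set earlier. Here is a simplified model of that mapping. OwnerRef and Object are stripped-down stand-ins for the real kubernetes types, and requestForOwner is a hypothetical helper, so treat this as an illustration of the idea rather than the controller-runtime code.

```go
package main

import "fmt"

// OwnerRef is a simplified stand-in for an owner reference on an object.
type OwnerRef struct {
	Kind       string
	Name       string
	Controller bool // true for the one owner that "controls" the object
}

// Object is a simplified stand-in for a watched object such as a Pod.
type Object struct {
	Namespace string
	Name      string
	Owners    []OwnerRef
}

// requestForOwner is a hypothetical helper: given a changed object, it returns
// the "namespace/name" key of its controlling owner of the given kind, if any.
func requestForOwner(obj Object, ownerKind string) (string, bool) {
	for _, ref := range obj.Owners {
		if ref.Kind == ownerKind && ref.Controller {
			return obj.Namespace + "/" + ref.Name, true
		}
	}
	return "", false
}

func main() {
	pod := Object{
		Namespace: "cnat",
		Name:      "at-sample",
		Owners:    []OwnerRef{{Kind: "At", Name: "at-sample", Controller: true}},
	}
	if key, ok := requestForOwner(pod, "At"); ok {
		fmt.Println("reconcile", key) // reconcile cnat/at-sample
	}

	// A pod without an At owner does not trigger our controller.
	stray := Object{Namespace: "cnat", Name: "unrelated"}
	_, ok := requestForOwner(stray, "At")
	fmt.Println(ok) // false
}
```

This is why the RUNNING case can simply return without requeuing: when the Pod's status changes, the owning At gets reconciled again.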
If you want to have a look at the entire final code implementation, head over to this GitHub repo: ishankhare07/kubebuilder-controller.
With this in place, the implementation of our controller is done. We can run and test it on the cluster in our current kube context using make run:
go: creating new go.mod: module tmp
go: found sigs.k8s.io/controller-tools/cmd/controller-gen in sigs.k8s.io/controller-tools v0.2.5
/Users/ishankhare/godev/bin/controller-gen object:headerFile="hack/boilerplate.go.txt" paths="./..."
go fmt ./...
go vet ./...
/Users/ishankhare/godev/bin/controller-gen "crd:trivialVersions=true" rbac:roleName=manager-role webhook paths="./..." output:crd:artifacts:config=config/crd/bases
go run ./main.go
2020-11-18T05:05:45.531+0530 INFO controller-runtime.metrics metrics server is starting to listen {"addr": ":8080"}
2020-11-18T05:05:45.531+0530 INFO setup starting manager
2020-11-18T05:05:45.531+0530 INFO controller-runtime.manager starting metrics server {"path": "/metrics"}
2020-11-18T05:05:45.531+0530 INFO controller-runtime.controller Starting EventSource {"controller": "at", "source": "kind source: /, Kind="}
2020-11-18T05:05:45.732+0530 INFO controller-runtime.controller Starting EventSource {"controller": "at", "source": "kind source: /, Kind="}
2020-11-18T05:05:46.232+0530 INFO controller-runtime.controller Starting Controller {"controller": "at"}
2020-11-18T05:05:46.232+0530 INFO controller-runtime.controller Starting workers {"controller": "at", "worker count": 1}
2020-11-18T05:05:46.232+0530 INFO controllers.At === Reconciling At {"at": "cnat/at-sample"}
2020-11-18T05:05:46.232+0530 INFO controllers.At Phase: PENDING {"at": "cnat/at-sample"}
2020-11-18T05:05:46.232+0530 INFO controllers.At Schedule parsing done {"at": "cnat/at-sample", "Result": "-14053h23m46.232658s"}
2020-11-18T05:05:46.232+0530 INFO controllers.At It's time! {"at": "cnat/at-sample", "Ready to execute": "echo YAY!"}
2020-11-18T05:05:46.436+0530 DEBUG controller-runtime.controller Successfully Reconciled {"controller": "at", "request": "cnat/at-sample"}
2020-11-18T05:05:46.443+0530 INFO controllers.At === Reconciling At {"at": "cnat/at-sample"}
2020-11-18T05:05:46.443+0530 INFO controllers.At Phase: RUNNING {"at": "cnat/at-sample"}
2020-11-18T05:05:46.718+0530 INFO controllers.At Pod Created successfully {"at": "cnat/at-sample", "name": "at-sample"}
2020-11-18T05:05:46.718+0530 DEBUG controller-runtime.controller Successfully Reconciled {"controller": "at", "request": "cnat/at-sample"}
2020-11-18T05:05:46.722+0530 INFO controllers.At === Reconciling At {"at": "cnat/at-sample"}
2020-11-18T05:05:46.722+0530 INFO controllers.At Phase: RUNNING {"at": "cnat/at-sample"}
2020-11-18T05:05:46.722+0530 DEBUG controller-runtime.controller Successfully Reconciled {"controller": "at", "request": "cnat/at-sample"}
2020-11-18T05:05:46.778+0530 INFO controllers.At === Reconciling At {"at": "cnat/at-sample"}
2020-11-18T05:05:46.778+0530 INFO controllers.At Phase: RUNNING {"at": "cnat/at-sample"}
2020-11-18T05:05:46.778+0530 DEBUG controller-runtime.controller Successfully Reconciled {"controller": "at", "request": "cnat/at-sample"}
2020-11-18T05:05:46.846+0530 INFO controllers.At === Reconciling At {"at": "cnat/at-sample"}
2020-11-18T05:05:46.847+0530 INFO controllers.At Phase: RUNNING {"at": "cnat/at-sample"}
2020-11-18T05:05:46.847+0530 DEBUG controller-runtime.controller Successfully Reconciled {"controller": "at", "request": "cnat/at-sample"}
2020-11-18T05:05:49.831+0530 INFO controllers.At === Reconciling At {"at": "cnat/at-sample"}
2020-11-18T05:05:49.831+0530 INFO controllers.At Phase: RUNNING {"at": "cnat/at-sample"}
2020-11-18T05:05:49.831+0530 INFO controllers.At Container terminated {"at": "cnat/at-sample", "reason": "", "message": ""}
2020-11-18T05:05:50.054+0530 DEBUG controller-runtime.controller Successfully Reconciled {"controller": "at", "request": "cnat/at-sample"}
2020-11-18T05:05:50.056+0530 INFO controllers.At === Reconciling At {"at": "cnat/at-sample"}
2020-11-18T05:05:50.056+0530 INFO controllers.At Phase: DONE {"at": "cnat/at-sample"}
2020-11-18T05:05:50.056+0530 DEBUG controller-runtime.controller Successfully Reconciled {"controller": "at", "request": "cnat/at-sample"}
If you're interested in seeing the command's output, we can run:
$ kubectl logs -f at-sample
hello world!
This post was first published on my personal blog ishankhare.dev