Amr Saafan for Nile Bits

Posted on Jun 29 • Originally published at nilebits.com

Kubernetes as a Database? What You Need to Know About CRDs

#kubernetes #crds #yaml #go

Within the fast developing field of cloud-native technologies, Kubernetes has become a potent tool for containerized application orchestration. But among developers and IT specialists, "Is Kubernetes a database?" is a frequently asked question. This post explores this question and offers a thorough description of Custom Resource Definitions (CRDs), highlighting their use in the Kubernetes ecosystem. We hope to make these ideas clear and illustrate the adaptability and power of Kubernetes in stateful application management with thorough code examples and real-world applications.

Introduction to Kubernetes

Kubernetes, often abbreviated as K8s, is an open-source platform designed to automate the deployment, scaling, and operation of application containers. Initially developed by Google, Kubernetes has become the de facto standard for container orchestration, supported by a vibrant community and a wide range of tools and extensions.

Core Concepts of Kubernetes

Before diving into the specifics of CRDs and the question of Kubernetes as a database, it's essential to understand some core concepts:

Pods: The smallest deployable units in Kubernetes, representing a single instance of a running process in a cluster.

Nodes: The worker machines in Kubernetes, which can be either virtual or physical.

Cluster: A set of nodes controlled by the Kubernetes master.

Services: An abstraction that defines a logical set of pods and a policy by which to access them.

Kubernetes as a Database: Myth or Reality?

Kubernetes itself is not a database. It is an orchestration platform that can manage containerized applications, including databases. However, the confusion often arises because Kubernetes can be used to deploy and manage database applications effectively.

Understanding Custom Resource Definitions (CRDs)

Custom Resource Definitions (CRDs) extend the Kubernetes API to allow users to manage their own application-specific custom resources. This capability makes Kubernetes highly extensible and customizable to fit various use cases.

What Are CRDs?

CRDs enable users to define custom objects that behave like built-in Kubernetes resources. For instance, while Kubernetes has built-in resources like Pods, Services, and Deployments, you can create custom resources such as "MySQLCluster" or "PostgreSQLBackup."

Creating a CRD

To create a CRD, you need to define it in a YAML file and apply it to your Kubernetes cluster. Here's a basic example:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: myresources.example.com
spec:
  group: example.com
  versions:
    - name: v1
      served: true
      storage: true
  scope: Namespaced
  names:
    plural: myresources
    singular: myresource
    kind: MyResource
    shortNames:
      - mr

Applying this YAML file with kubectl apply -f myresource-crd.yaml will create the custom resource definition in your cluster.

Managing Custom Resources

Once the CRD is created, you can start managing custom resources as you would with native Kubernetes resources. Here’s an example of a custom resource instance:

apiVersion: example.com/v1
kind: MyResource
metadata:
  name: myresource-sample
spec:
  foo: bar
  count: 10

You can create this custom resource with:

kubectl apply -f myresource-instance.yaml

Using CRDs for Stateful Applications

Custom Resource Definitions are particularly useful for managing stateful applications, including databases. They allow you to define the desired state of a database cluster, backup policies, and other custom behaviors.

Example: Managing a MySQL Cluster with CRDs

Consider a scenario where you need to manage a MySQL cluster with Kubernetes. You can define a custom resource to represent the MySQL cluster configuration:

apiVersion: example.com/v1
kind: MySQLCluster
metadata:
  name: my-mysql-cluster
spec:
  replicas: 3
  version: "5.7"
  storage:
    size: 100Gi
    class: standard

With this CRD, you can create, update, and delete MySQL clusters using standard Kubernetes commands, making database management more straightforward and integrated with the rest of your infrastructure.

Advanced CRD Features

CRDs offer several advanced features that enhance their functionality and integration with the Kubernetes ecosystem.

Validation Schemas

You can define validation schemas for custom resources to ensure that only valid configurations are accepted. Here’s an example of adding a validation schema to the MySQLCluster CRD:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: mysqlclusters.example.com
spec:
  group: example.com
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                replicas:
                  type: integer
                  minimum: 1
                version:
                  type: string
                storage:
                  type: object
                  properties:
                    size:
                      type: string
                    class:
                      type: string
  scope: Namespaced
  names:
    plural: mysqlclusters
    singular: mysqlcluster
    kind: MySQLCluster
    shortNames:
      - mc

Custom Controllers

To automate the management of custom resources, you can write custom controllers. These controllers watch for changes to custom resources and take actions to reconcile the actual state with the desired state.

Here’s an outline of how you might write a controller for the MySQLCluster resource:

package main

import (
    "context"
    "log"

    mysqlv1 "example.com/mysql-operator/api/v1"
    "k8s.io/apimachinery/pkg/runtime"
    "k8s.io/client-go/kubernetes/scheme"
    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/client"
    "sigs.k8s.io/controller-runtime/pkg/manager"
)

type MySQLClusterReconciler struct {
    client.Client
    Scheme *runtime.Scheme
}

func (r *MySQLClusterReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    var mysqlCluster mysqlv1.MySQLCluster
    if err := r.Get(ctx, req.NamespacedName, &mysqlCluster); err != nil {
        return ctrl.Result{}, client.IgnoreNotFound(err)
    }

    // Reconciliation logic goes here

    return ctrl.Result{}, nil
}

func main() {
    mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
        Scheme: scheme.Scheme,
    })
    if err != nil {
        log.Fatalf("unable to start manager: %v", err)
    }

    if err := (&MySQLClusterReconciler{
        Client: mgr.GetClient(),
        Scheme: mgr.GetScheme(),
    }).SetupWithManager(mgr); err != nil {
        log.Fatalf("unable to create controller: %v", err)
    }

    if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
        log.Fatalf("unable to run manager: %v", err)
    }
}

Versioning

CRDs support versioning, allowing you to manage different versions of your custom resources and migrate between them smoothly. This is crucial for maintaining backward compatibility and evolving your APIs over time.

Case Study: Kubernetes Operators for Databases

Kubernetes Operators leverage CRDs and custom controllers to automate the management of complex stateful applications like databases. Let's explore a real-world example: the MySQL Operator.

The MySQL Operator

The MySQL Operator simplifies the deployment and management of MySQL clusters on Kubernetes. It uses CRDs to define the desired state of the MySQL cluster and custom controllers to handle tasks like provisioning, scaling, and backups.

Defining the MySQLCluster CRD

The MySQL Operator starts by defining a CRD for the MySQLCluster resource, as shown earlier. This CRD includes fields for specifying the number of replicas, MySQL version, storage requirements, and more.

Writing the MySQLCluster Controller

The controller for the MySQLCluster resource continuously watches for changes to MySQLCluster objects and reconciles the actual state with the desired state. For example, if the number of replicas is increased, the controller will create new MySQL instances and configure them to join the cluster.

Code Example: Scaling a MySQL Cluster

Here’s a simplified version of the controller logic for scaling a MySQL cluster:

func (r *MySQLClusterReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    var mysqlCluster mysqlv1.MySQLCluster
    if err := r.Get(ctx, req.NamespacedName, &mysqlCluster); err != nil {
        return ctrl.Result{}, client.IgnoreNotFound(err)
    }

    // Fetch the current number of MySQL pods
    var pods corev1.PodList
    if err := r.List(ctx, &pods, client.InNamespace(req.Namespace), client.MatchingLabels{
        "app":  "mysql",
        "role": "master",
    }); err != nil {
        return ctrl.Result{}, err
    }

    currentReplicas := len(pods.Items)
    desiredReplicas := mysqlCluster.Spec.Replicas

    if currentReplicas < desiredReplicas {
        // Scale up
        for i := currentReplicas; i < desiredReplicas; i++ {
            newPod := corev1.Pod{
                ObjectMeta: metav1.ObjectMeta{
                    Name:      fmt.Sprintf("mysql-%d", i),
                    Namespace: req.Namespace,
                    Labels: map[string

]string{
                        "app":  "mysql",
                        "role": "master",
                    },
                },
                Spec: corev1.PodSpec{
                    Containers: []corev1.Container{
                        {
                            Name:  "mysql",
                            Image: "mysql:5.7",
                            Ports: []corev1.ContainerPort{
                                {
                                    ContainerPort: 3306,
                                },
                            },
                        },
                    },
                },
            }
            if err := r.Create(ctx, &newPod); err != nil {
                return ctrl.Result{}, err
            }
        }
    } else if currentReplicas > desiredReplicas {
        // Scale down
        for i := desiredReplicas; i < currentReplicas; i++ {
            podName := fmt.Sprintf("mysql-%d", i)
            pod := &corev1.Pod{
                ObjectMeta: metav1.ObjectMeta{
                    Name:      podName,
                    Namespace: req.Namespace,
                },
            }
            if err := r.Delete(ctx, pod); err != nil {
                return ctrl.Result{}, err
            }
        }
    }

    return ctrl.Result{}, nil
}

Benefits of Using Kubernetes Operators

Kubernetes Operators, built on CRDs and custom controllers, provide several benefits for managing stateful applications:

Automation: Operators automate routine tasks such as scaling, backups, and failover, reducing the operational burden on administrators.

Consistency: By encapsulating best practices and operational knowledge, Operators ensure that applications are managed consistently and reliably.

Extensibility: Operators can be extended to support custom requirements and integrate with other tools and systems.

Conclusion

Although Kubernetes is not a database in and of itself, it offers a strong framework for installing and administering database applications. A strong addition to the Kubernetes API, bespoke Resource Definitions (CRDs) allow users to design and manage bespoke resources that are suited to their particular requirements.

You may build Kubernetes Operators that automate the administration of intricate stateful applications, such as databases, by utilizing CRDs and custom controllers. This method offers more flexibility and customization, improves consistency, and streamlines processes.

This article has covered the foundations of CRDs, shown comprehensive code examples, and shown how stateful applications may be efficiently managed with CRDs. To fully utilize Kubernetes, you must comprehend and make use of CRDs, regardless of whether you are implementing databases or other sophisticated services on this potent orchestration platform.

As Kubernetes develops further, its adaptability to a variety of use cases and needs is ensured by its expansion through CRDs and Operators. Keep investigating and experimenting with these functionalities as you progress with Kubernetes to open up new avenues for your infrastructure and apps.

DEV Community

Kubernetes as a Database? What You Need to Know About CRDs

Top comments (0)

Read next

Amazon Elastic Kubernetes Service now supports using NVIDIA and AWS Neuron accelerated instance types with AL2023

🛠️ Battle of the Backend: Go vs Node.js vs Python – Which One Reigns Supreme in 2024? 🚀

VS Code YAML Plugin Setup for Kubernetes Beginners

Open-Source Development is Amazing!