CiCube for CICube

Posted on Nov 11 • Originally published at cicube.io

How to Use Kubernetes CronJob

#kubernetes #devops #security

Introduction

A Kubernetes CronJob is a straightforward way to run Jobs inside a cluster on a recurring schedule. This article covers how to create, manage, and troubleshoot CronJobs, including schedule syntax, configuration, and practical examples. By the end, you'll have a good overview of how to use CronJobs effectively in your Kubernetes environments.

What is a CronJob?

A Kubernetes CronJob works much like the cron facility you may already know from Unix systems. It creates Jobs at specified times or on a recurring schedule. For example, if you want something to run once a day at 2 AM, you can describe that with a CronJob resource. The schedule format matters: Kubernetes uses standard cron syntax with five fields for minute, hour, day of the month, month, and day of the week. For example, the schedule 0 2 * * * triggers a Job every day at 2 AM.

A CronJob ties each Job it creates to a scheduled time, so the Job runs only at the intervals you define. Be careful when naming CronJobs, though. The name must be a valid DNS subdomain and no longer than 52 characters, because the CronJob controller appends 11 characters to it when naming the Jobs it creates, and a Job name cannot exceed 63 characters. Invalid names are rejected by the API server, and names that are too similar are easy to confuse once the generated suffixes are added.
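
The naming rules above can be checked before a manifest is applied. The following is a minimal Python sketch, not part of Kubernetes itself: the function name is_valid_cronjob_name is made up for illustration, and the pattern assumes the usual DNS subdomain rules (lowercase alphanumerics, '-' and '.', with each dot-separated label starting and ending with an alphanumeric).

```python
import re

# One DNS label: starts and ends with a lowercase alphanumeric,
# with '-' allowed in between.
_LABEL = r"[a-z0-9]([-a-z0-9]*[a-z0-9])?"
# A DNS subdomain is one or more labels separated by dots.
_SUBDOMAIN = re.compile(_LABEL + r"(\." + _LABEL + r")*")

# Limit from the text above: 52 characters plus the 11 appended
# for generated Job names stays within the 63-character Job limit.
MAX_CRONJOB_NAME_LENGTH = 52


def is_valid_cronjob_name(name: str) -> bool:
    """Return True if the name fits the length and DNS subdomain rules (hypothetical helper)."""
    if len(name) > MAX_CRONJOB_NAME_LENGTH:
        return False
    return _SUBDOMAIN.fullmatch(name) is not None


print(is_valid_cronjob_name("my-backup-job"))  # True
print(is_valid_cronjob_name("Hello"))          # False: uppercase is not allowed
print(is_valid_cronjob_name("a" * 53))         # False: longer than 52 characters
```

A check like this is only a convenience; the API server remains the authority on whether a name is accepted.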

Creating a Simple CronJob

In this section, we will create a very simple CronJob that prints the current date and a greeting message. We will define the CronJob in a YAML manifest, apply it to the cluster, and then view its output.

Here is an example YAML manifest:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "* * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox:1.28
            imagePullPolicy: IfNotPresent
            command:
            - /bin/sh
            - -c
            - date; echo 'Hello from the Kubernetes cluster'
          restartPolicy: OnFailure

The manifest above schedules a Job to run every minute. Each run prints the current date and the message "Hello from the Kubernetes cluster". Save it as cronjob.yaml, apply it with kubectl, and then view the logs:

kubectl apply -f cronjob.yaml

The output should confirm that the CronJob was created:

cronjob.batch/hello created

To see the logs of a Job that has run:

kubectl get jobs
kubectl logs job/<job-name>

Here, replace <job-name> with the name Kubernetes generated for the Job created by this CronJob. The log output lets you verify that the CronJob ran as expected.

How Schedule Syntax Works

The schedule syntax determines when a CronJob's Jobs run. It closely follows standard Unix/Linux cron: five space-separated fields in the order minute, hour, day of the month, month, and day of the week. Here's what that looks like:

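# ┌───────────── minute (0 - 59)
# │ ┌───────────── hour (0 - 23)
# │ │ ┌───────────── day of the month (1 - 31)
# │ │ │ ┌───────────── month (1 - 12)
# │ │ │ │ ┌───────────── day of the week (0 - 6) (Sunday to Saturday)
# │ │ │ │ │                                   OR sun, mon, tue, wed, thu, fri, sat
# │ │ │ │ │
# * * * * *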

For example, the expression 0 3 * * 1 runs the Job at 3 AM every Monday.

Special characters you can use include:

* matches every value of a field (every minute, every hour, and so on)
, separates multiple values (e.g., 1,2,3)
- defines a range, such as 1-5 in the day-of-week field for Monday to Friday
/ defines a step, so */2 in the hour field means every two hours
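
To make these rules concrete, here is a small Python sketch that expands each field of a schedule into the values it matches. It is a simplified illustration, not the parser Kubernetes uses: it handles only numeric values with *, lists, ranges, and steps, and does not handle names such as mon or macros. The names expand_field and parse_schedule are made up for this example.

```python
# Field names and their allowed ranges, in schedule order.
FIELDS = [
    ("minute", 0, 59),
    ("hour", 0, 23),
    ("day_of_month", 1, 31),
    ("month", 1, 12),
    ("day_of_week", 0, 6),
]


def expand_field(field: str, low: int, high: int) -> list[int]:
    """Expand one cron field into the sorted list of values it matches."""
    values: set[int] = set()
    for part in field.split(","):               # ',' separates alternatives
        base, _, step_text = part.partition("/")  # '/' introduces a step
        step = int(step_text) if step_text else 1
        if base == "*":                         # '*' covers the whole range
            start, end = low, high
        elif "-" in base:                       # '-' defines an inclusive range
            start_text, end_text = base.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:                                   # a single value
            start = end = int(base)
        if not (low <= start <= end <= high) or step < 1:
            raise ValueError(f"invalid cron field {field!r}")
        values.update(range(start, end + 1, step))
    return sorted(values)


def parse_schedule(expression: str) -> dict[str, list[int]]:
    """Split a five-field schedule and expand every field."""
    parts = expression.split()
    if len(parts) != len(FIELDS):
        raise ValueError("expected five space-separated fields")
    return {
        name: expand_field(part, low, high)
        for part, (name, low, high) in zip(parts, FIELDS)
    }


schedule = parse_schedule("0 3 * * 1")
print(schedule["hour"])         # [3]
print(schedule["day_of_week"])  # [1], i.e. Monday
print(expand_field("1-5", 0, 6))  # [1, 2, 3, 4, 5]
```

Expanding a field this way is a quick sanity check that an expression such as */2 covers the values you expect before you put it in a manifest.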

You can also use macros as shorthand for common schedules:

@yearly - Runs once a year, at 00:00 on January 1
@monthly - Runs once a month, at 00:00 on the first day of the month
@weekly - Runs once a week, at 00:00 on Sunday
@daily - Runs once a day, at 00:00
@hourly - Runs at the beginning of every hour
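
Each macro is shorthand for a five-field expression. The sketch below shows that mapping in Python, using the standard cron equivalents; the helper name normalize_schedule is hypothetical and exists only for illustration.

```python
# Standard five-field equivalents of the macros listed above.
MACROS = {
    "@yearly": "0 0 1 1 *",
    "@monthly": "0 0 1 * *",
    "@weekly": "0 0 * * 0",
    "@daily": "0 0 * * *",
    "@hourly": "0 * * * *",
}


def normalize_schedule(schedule: str) -> str:
    """Return the five-field form of a macro, or the schedule unchanged (hypothetical helper)."""
    return MACROS.get(schedule.strip(), schedule)


print(normalize_schedule("@weekly"))    # 0 0 * * 0
print(normalize_schedule("0 2 * * *"))  # 0 2 * * *
```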

Online tools like crontab.guru can help you build schedule expressions. The site explains what a given expression means and validates the ones you write. Getting this syntax right ensures your tasks run exactly when you intend.

Troubleshooting CronJobs

In this section, let's look at how to troubleshoot CronJobs that miss schedules or fail. Setting .spec.startingDeadlineSeconds correctly and handling errors gracefully are both critical to keeping CronJobs reliable.

The .spec.startingDeadlineSeconds field specifies how long, in seconds, after its scheduled time a Job may still be started. If the Job cannot be started within that window, Kubernetes skips that run; later runs are still scheduled. For example, startingDeadlineSeconds: 300 means a Job that cannot start within 5 minutes of its scheduled time is skipped.

Below is a sample CronJob configuration with startingDeadlineSeconds set:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: my-backup-job
spec:
  schedule: "0 2 * * *"
  startingDeadlineSeconds: 300
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: backup-image:latest
            command:
            - /bin/sh
            - -c
            - echo "Backing up data"
          restartPolicy: OnFailure
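
The deadline rule can be modeled in a few lines. This is a simplified Python sketch of the behavior described above, not the CronJob controller's actual logic; the function name can_still_start is an assumption made for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional


def can_still_start(
    scheduled: datetime,
    now: datetime,
    starting_deadline_seconds: Optional[int],
) -> bool:
    """Return True if a run scheduled at `scheduled` may still start at `now`."""
    if starting_deadline_seconds is None:
        # No deadline configured: in this model a late start is always allowed.
        return True
    return now - scheduled <= timedelta(seconds=starting_deadline_seconds)


scheduled = datetime(2024, 1, 1, 2, 0, 0)  # the 2 AM run from the manifest above

# 4 minutes late with a 300-second deadline: still allowed to start.
print(can_still_start(scheduled, scheduled + timedelta(minutes=4), 300))  # True
# 6 minutes late: the run is skipped.
print(can_still_start(scheduled, scheduled + timedelta(minutes=6), 300))  # False
```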

Use the following command to check the status of the Jobs a CronJob has created:

kubectl get jobs

If a Job created by a CronJob fails, its status will show it. To get the details, run:

kubectl describe job <job-name>

Replace <job-name> with the name of your Job. The output, including the events listed at the end, shows why the Job failed. In short, knowing how to set .spec.startingDeadlineSeconds and how to investigate failures is essential for keeping CronJobs running. Checking Job statuses regularly and troubleshooting failures early keeps your automated workflows reliable.

Understanding Concurrency in CronJobs

An important part of writing CronJobs is managing concurrent executions so that runs do not overlap when they shouldn't. The .spec.concurrencyPolicy field controls what happens when a new run is due while a previous Job is still running. It accepts three values:

Allow (default): Jobs are allowed to run concurrently. If a Job takes longer than the interval between schedules, the next Job starts alongside it, so several runs can overlap. For example:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: concurrent-allow
spec:
  schedule: "* * * * *"
  concurrencyPolicy: Allow
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox:1.28
            command:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure

Forbid: If the previous Job is still running when the next scheduled time arrives, Kubernetes skips the new run. This prevents overlapping executions, which can save resources and avoid race conditions. For instance:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: concurrent-forbid
spec:
  schedule: "* * * * *"
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox:1.28
            command:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure

Replace: If the previous Job is still running when the next scheduled time arrives, Kubernetes replaces it with a new Job, so only the newest execution runs. This is handy when the result of an older run is no longer relevant. Example:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: concurrent-replace
spec:
  schedule: "* * * * *"
  concurrencyPolicy: Replace
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox:1.28
            command:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure

To see these policies in action, apply each CronJob with kubectl apply -f <filename>.yaml, then watch the Jobs as they are created:

kubectl get jobs --watch

This lets you see how each concurrency policy affects the scheduling and execution of your CronJobs.
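
The three policies can also be compared without a cluster using a toy simulation. The Python sketch below is a simplified model under stated assumptions (a fixed interval between scheduled runs and a fixed Job duration); it is not how the CronJob controller is implemented, and the simulate function is made up for illustration.

```python
def simulate(policy: str, interval: int, duration: int, runs: int) -> list[tuple[int, str]]:
    """Return (time, action) for each scheduled run under the given policy."""
    if policy not in ("Allow", "Forbid", "Replace"):
        raise ValueError(f"unknown concurrencyPolicy {policy!r}")
    running_until: list[int] = []  # end times of Jobs that are still running
    events: list[tuple[int, str]] = []
    for i in range(runs):
        now = i * interval
        # Forget Jobs that have already finished.
        running_until = [end for end in running_until if end > now]
        if running_until and policy == "Forbid":
            events.append((now, "skip"))      # previous Job still running
            continue
        if running_until and policy == "Replace":
            running_until.clear()             # stop the old Job
            events.append((now, "replace"))
        elif running_until:
            events.append((now, "overlap"))   # Allow: start alongside the old Job
        else:
            events.append((now, "start"))
        running_until.append(now + duration)
    return events


# A Job that takes 90 seconds, scheduled every 60 seconds, four times.
for policy in ("Allow", "Forbid", "Replace"):
    actions = [action for _, action in simulate(policy, 60, 90, 4)]
    print(policy, actions)
# Allow ['start', 'overlap', 'overlap', 'overlap']
# Forbid ['start', 'skip', 'start', 'skip']
# Replace ['start', 'replace', 'replace', 'replace']
```

With a Job that outlasts its interval, Allow piles up overlapping runs, Forbid drops every other run, and Replace never lets a run finish, which is a useful reminder to match the policy to how long the Job actually takes.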

Conclusion

Mastering Kubernetes CronJobs can make a real difference in how you manage applications and automate tasks within a cluster. With an understanding of scheduling syntax, job templates, and limitations, you can build robust and efficient automated workflows. Equipped with the practical examples and troubleshooting tips above, you are ready to put CronJobs to work in your DevOps practice.
