DEV Community

Chen Debra

Quick Start Guide for DolphinScheduler: Installation and Configuration Using Docker Compose

DolphinScheduler is an open-source distributed task scheduling system, widely used in big data environments to manage complex workflows. This article walks through installing and configuring DolphinScheduler with Docker Compose so that you can get a working instance running quickly.

1. Environment Preparation

First, ensure that Docker and Docker Compose are installed on your system. Docker is an open-source containerization platform that allows developers to package applications and their dependencies into containers, providing high portability and consistency. Docker Compose is a tool for defining and managing multi-container applications. It uses a YAML file to configure the services and provides a single command to start or stop those services.

1.1 Verifying Docker and Docker Compose Installation

You can verify if Docker and Docker Compose are installed correctly using the following commands:

docker --version
docker-compose --version

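If both commands print version information, the installation was successful. Note that recent Docker releases ship Compose as a plugin that is invoked as docker compose (with a space); if docker-compose is not found, try docker compose version instead.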
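
The same check can be scripted. The following is a minimal sketch, assuming a POSIX shell; the helper names have_cmd and compose_cmd are made up for this example and are not part of Docker or DolphinScheduler. It reports whether docker is on the PATH and which Compose variant is available.

```shell
#!/bin/sh
# have_cmd NAME: succeed if NAME is an executable on PATH (hypothetical helper).
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# compose_cmd: print the Compose invocation available on this machine,
# or fail if neither variant is found (hypothetical helper).
compose_cmd() {
  if have_cmd docker-compose; then
    echo "docker-compose"
  elif have_cmd docker && docker compose version >/dev/null 2>&1; then
    echo "docker compose"
  else
    return 1
  fi
}

if have_cmd docker; then echo "docker: found"; else echo "docker: not found"; fi
if cmd=$(compose_cmd); then echo "compose: $cmd"; else echo "compose: not found"; fi
```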

2. Downloading the DolphinScheduler Docker Compose Configuration File

Before installing and running DolphinScheduler, you need to obtain its Docker Compose configuration file. This file defines the runtime environment for DolphinScheduler and its dependent services. Follow these steps to get the configuration file:

2.1 Clone the DolphinScheduler Project

First, use Git to clone the official DolphinScheduler repository:

git clone https://github.com/apache/dolphinscheduler.git

This will download the DolphinScheduler project to your local machine. Next, navigate to the project directory:

cd dolphinscheduler/docker

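In this directory, you will find a file named docker-compose.yml, which is the core configuration file for Docker Compose. The repository layout has changed between releases (recent versions keep the Compose files under deploy/docker), so if the directory or file is missing, search the clone for docker-compose.yml.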
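
If you are unsure where the file lives in the release you cloned, a search of the clone settles it. This is a small sketch; find_compose_files is a hypothetical helper name, and the directory passed to it is whatever you cloned into.

```shell
#!/bin/sh
# find_compose_files DIR: list Compose files below DIR, sorted by path (hypothetical helper).
find_compose_files() {
  find "$1" -type f \( -name 'docker-compose.yml' -o -name 'docker-compose.yaml' \) | sort
}

# Example: run from the directory that contains the clone.
# find_compose_files dolphinscheduler
```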

3. Configuring the Docker Compose File

The docker-compose.yml file defines the services needed to run DolphinScheduler: a MySQL database, a ZooKeeper node, and DolphinScheduler's Master and Worker nodes. You can modify this file to adjust the configuration of each service.

3.1 Overview of the Docker Compose File

A simplified version of the file has the following basic structure (the file shipped in the repository defines more services and varies by release):

version: '3.1'
services:
  zookeeper:
    image: zookeeper:3.5.6
    ports:
      - "2181:2181"
  mysql:
    image: mysql:5.7
    environment:
      MYSQL_ROOT_PASSWORD: root
      MYSQL_DATABASE: dolphinscheduler
    ports:
      - "3306:3306"
  dolphinscheduler-master:
    image: apache/dolphinscheduler:latest
    depends_on:
      - mysql
      - zookeeper
    ports:
      - "12345:12345"
    environment:
      # Mapping form: in list form (- KEY="value") the quotes would become part of the value
      DOLPHINSCHEDULER_OPTS: "-Xms512m -Xmx512m"
  dolphinscheduler-worker:
    image: apache/dolphinscheduler:latest
    depends_on:
      - dolphinscheduler-master
    environment:
      # JVM heap settings for the worker; mapping form keeps the quotes out of the value
      DOLPHINSCHEDULER_OPTS: "-Xms512m -Xmx512m"

In this configuration file:

  • zookeeper: Responsible for cluster coordination and service discovery.
  • mysql: Stores the metadata for DolphinScheduler.
  • dolphinscheduler-master: The master node responsible for scheduling and managing tasks.
  • dolphinscheduler-worker: The worker node responsible for executing tasks.
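
After editing the file, it is worth validating it before starting anything. The sketch below wraps the config subcommand of Docker Compose, which parses the file and reports errors. The validate_compose helper and the COMPOSE variable are names invented for this example; set COMPOSE to "docker compose" if you use the plugin variant.

```shell
#!/bin/sh
# COMPOSE holds the Compose invocation; it is left unquoted below so that
# a two-word value such as "docker compose" is split into command and argument.
COMPOSE="${COMPOSE:-docker-compose}"

# validate_compose FILE: print "valid" or "invalid" for the given Compose file (hypothetical helper).
validate_compose() {
  # "config -q" validates the file without printing the resolved configuration.
  if $COMPOSE -f "$1" config -q; then
    echo "valid"
  else
    echo "invalid"
    return 1
  fi
}

# Example:
# validate_compose docker-compose.yml
```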

4. Starting DolphinScheduler

Once the docker-compose.yml file is configured correctly, you can start DolphinScheduler using Docker Compose:

docker-compose up -d

This command will start all the services defined in the docker-compose.yml file in the background. You can check the status of the services with the following command:

docker-compose ps

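If all services are listed as Up, the containers are running and DolphinScheduler has been started. Up describes the container state only, so the application inside may still need some time to finish initializing.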
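
With several services, it can be easier to list only the ones that are not running. The following sketch compares two lists of service names; missing_services is a hypothetical helper, and the example call assumes a Compose version whose ps subcommand supports the --services and --filter options.

```shell
#!/bin/sh
# missing_services ALL RUNNING: print each name in ALL that is absent from RUNNING.
# Both arguments are whitespace- or newline-separated lists of service names (hypothetical helper).
missing_services() {
  running=" $(echo $2) "   # unquoted on purpose: collapses newlines into single spaces
  for svc in $1; do
    case "$running" in
      *" $svc "*) ;;        # service is running
      *) echo "$svc" ;;     # service is defined but not running
    esac
  done
}

# Example:
# missing_services "$(docker-compose ps --services)" \
#                  "$(docker-compose ps --services --filter status=running)"
```

An empty result means every defined service is running.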

5. Configuring DolphinScheduler

5.1 Initial Setup

Once started, you can access DolphinScheduler's web UI through a browser. By default, the access URL is:

http://localhost:12345

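At the login screen, sign in with the default admin account. This guide uses username admin and password admin; official DolphinScheduler releases document the default password as dolphinscheduler123, so try that if admin is rejected. Similarly, if the root URL does not load, note that official releases serve the UI under the /dolphinscheduler/ui path. After logging in, change the default password to improve security.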

5.2 Creating Projects and Tasks

In the web UI, you can create projects and define tasks. DolphinScheduler supports various task types such as Shell, Python, and SQL. You can create workflows by dragging and dropping tasks and setting dependencies between them.

5.3 System Monitoring and Log Management

DolphinScheduler offers rich monitoring and logging features. Users can view task execution statuses, monitor cluster health in real time, and access detailed execution logs, which are helpful for debugging and optimizing workflows.

6. Common Issues and Solutions

During usage, you may encounter some issues. Below are common problems and their solutions.

6.1 Service Startup Failures

If a service fails to start, you can check the logs to diagnose the issue using the following command:

docker-compose logs <service_name>

For example:

docker-compose logs dolphinscheduler-master

The log information can help identify errors such as database connection failures or port conflicts.
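
Full logs can be long, so filtering them for likely failure lines is a quick first pass. This is a rough sketch: filter_errors is a hypothetical helper, and its keyword list is a guess at typical failure markers rather than an official list of DolphinScheduler messages.

```shell
#!/bin/sh
# filter_errors: keep only the lines from stdin that look like failures (hypothetical helper).
# Matching is case-insensitive; extend the pattern to suit your own logs.
filter_errors() {
  grep -E -i 'error|exception|refused|address already in use'
}

# Example: scan the last 200 log lines of the master service.
# docker-compose logs --tail=200 dolphinscheduler-master | filter_errors
```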

6.2 Database Connection Issues

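If the logs show database connection failures during startup, the MySQL service was probably not ready in time: depends_on only controls the order in which containers start and does not wait for MySQL to accept connections. In this case, restart the DolphinScheduler services manually once MySQL is up: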

docker-compose restart dolphinscheduler-master dolphinscheduler-worker
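
Rather than guessing when MySQL is ready, you can poll it and restart only after it responds. The sketch below is one way to do that; wait_until is a hypothetical retry helper, and the example call assumes the service names and root password from the sample file above, plus the mysqladmin client that ships in the mysql image.

```shell
#!/bin/sh
# wait_until TRIES DELAY CMD...: run CMD until it succeeds, at most TRIES times,
# sleeping DELAY seconds after each failed attempt (hypothetical helper).
wait_until() {
  tries=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$tries" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example: poll MySQL up to 30 times, 2 seconds apart, then restart the DolphinScheduler services.
# wait_until 30 2 docker-compose exec -T mysql mysqladmin ping -uroot -proot --silent &&
#   docker-compose restart dolphinscheduler-master dolphinscheduler-worker
```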

7. Advantages and Use Cases of DolphinScheduler

DolphinScheduler excels in big data processing and ETL task scheduling. Some key advantages include:

  • User-Friendly Interface: Its graphical interface makes task management and monitoring easy, lowering the barrier to entry.
  • Flexible Task Dependency Management: It allows defining complex task dependencies, making scheduling more efficient.
  • High Scalability and Availability: Its distributed architecture is suitable for large-scale data processing.

8. Conclusion

By following the above steps, you have successfully installed and configured DolphinScheduler using Docker Compose. Its powerful features and flexible configuration make it an ideal choice for distributed task scheduling. Whether for enterprise-level big data processing or small-to-medium-sized data integration projects, DolphinScheduler is a reliable solution.

If you encounter issues, refer to DolphinScheduler's official documentation or community resources for more detailed technical support. From here, you can build on this setup to manage more complex workflows.
