DEV Community

Chen Debra
Chen Debra

Posted on

DolphinScheduler Project Management Page Loads Slowly? Here's How to Optimize It

Image description

Problem Description

Sometimes, the Apache DolphinScheduler project management page fails to load. When inspecting in the browser, the request [http://ip:12345/dolphinscheduler/projects?pageSize=10&pageNo=1&searchVal=] times out.

Troubleshooting Steps

1. Check DolphinScheduler Logs (No Exceptions Found)

tail /home/dolphinscheduler/api-server/logs/dolphinscheduler-api.log  
Enter fullscreen mode Exit fullscreen mode

Image description

2. Restart the DolphinScheduler api-server Module

sh /home/dolphinscheduler/bin/dolphinscheduler-daemon.sh stop api-server  
sh /home/dolphinscheduler/bin/dolphinscheduler-daemon.sh start api-server  
Enter fullscreen mode Exit fullscreen mode

Result: The page still fails to load.

3. Test the Timeout Request and Observe the Response

Testing reveals that data can be returned, but it takes over ten seconds.

Image description

4. Check the Source Code for Potential Slow Queries

Code location:

Image description

5. Use Arthas to Analyze Runtime Performance

curl -O https://arthas.aliyun.com/arthas-boot.jar  
java -jar arthas-boot.jar  
# Identify the DolphinScheduler API process; in my case, it was the first one.  
Enter fullscreen mode Exit fullscreen mode

Trace the runtime to find where the delay occurs. It was identified in the queryProjectListPaging method.

trace org.apache.dolphinscheduler.api.service.impl.ProjectServiceImpl queryProjectListPaging -n 1  
Enter fullscreen mode Exit fullscreen mode

Image description

Inspecting the queryProjectListPaging method:

Image description

6. Run the SQL Query and Analyze

Testing the SQL query reveals that it is not inherently slow.

SELECT * FROM t_ds_project p  
WHERE 1=1 AND p.id IN (1, 2, 3, 5, 6, 8, 9, 12, 13, 14, 16, 17, 19, 21, 22, 23, 25, 26)  
GROUP BY p.id  
ORDER BY p.id DESC;  
Enter fullscreen mode Exit fullscreen mode

Image description

Testing the SQL being executed by DolphinScheduler, using Arthas and the CTBots MyBatis SQL Parser:

Image description

7. Identify the Slow Query

The slow SQL query was identified as follows:

SELECT  
    p.id, p.name, p.code, p.description, p.user_id, p.flag, p.create_time, p.update_time,  
    u.user_name AS user_name,  
    (SELECT COUNT(*) FROM t_ds_process_definition AS def WHERE def.project_code = p.code) AS def_count,  
    (SELECT COUNT(*) FROM t_ds_process_definition_log def, t_ds_process_instance inst  
     WHERE def.code = inst.process_definition_code  
       AND def.version = inst.process_definition_version  
       AND def.project_code = p.code  
       AND inst.state = 1) AS inst_running_count  
FROM t_ds_project p  
LEFT JOIN t_ds_user u ON u.id = p.user_id  
WHERE 1=1  
  AND p.id IN (1, 2, 3, 5, 6, 8, 9, 12, 13, 14, 16, 17, 19, 21, 22, 23, 25, 26)  
GROUP BY p.id, u.user_name  
ORDER BY p.id DESC LIMIT 100;  
Enter fullscreen mode Exit fullscreen mode

Analysis using EXPLAIN revealed that the bottleneck was in the inst_running_count subquery.

The t_ds_process_definition table contains approximately 280,000 rows, while t_ds_process_definition_log has 1,000 rows.

8. Optimize the Query with Indexes

Adding an index to the t_ds_process_instance table significantly improved performance:

CREATE INDEX qwl_diy_index  
    ON t_ds_process_instance (process_definition_code, process_definition_version, state);  
Enter fullscreen mode Exit fullscreen mode

9. Verify the Results

After adding the index, the query execution time improved, and the page loaded correctly.

Image description

By addressing the slow query with an appropriate index, the issue was resolved. If this article was helpful, feel free to like, bookmark, or share it! 😊

Top comments (0)