Oluwajubelo
Strategies for Optimizing MySQL Database Performance: A Case Study of Class54

Optimizing Class54 Database to Handle Hundreds of Thousands of Records Without Latency

Managing a database to handle extensive data processing efficiently can be a daunting task, especially when dealing with hundreds of thousands of records. This was precisely the challenge we faced with the Class54 database, which served as the backbone of our application. Its inefficiencies in handling large data sets led to significant latency issues, hampering performance and user experience. Here, I will share how we identified the problems, implemented solutions, and achieved optimal performance for our system.

The Problem
Class54, a test preparation platform, provides resources for major Nigerian exams, including JAMB, WAEC, and the Post-UTME exams of 36 major Nigerian schools. The platform hosts JAMB and WAEC past questions from 2000 to 2024. Initially designed to support a modest number of users, the platform grew exponentially, and so did the database’s workload. The core database, built on MySQL, began experiencing severe performance degradation when processing larger volumes of data. Tasks like fetching records for analytical dashboards, generating reports, or executing bulk updates took unacceptably long to complete.

The following symptoms were observed:

1. Slow Query Execution:

Some queries took several seconds or even minutes to execute.

2. High CPU Utilization:

The database server frequently reached its CPU usage limits during peak hours.

3. Locking and Deadlocks:

Concurrent operations led to frequent table locking, delaying processes and sometimes causing application crashes.

4. Inefficient Indexing:

Tables had either too many indexes or poorly optimized ones, leading to increased storage usage and query latency.

5. Scalability Issues:

The current design could not efficiently scale to handle the growing dataset without considerable cost.

Diagnosis

To address the issue, a thorough analysis of the database was conducted. This involved:

1. Query Profiling:

Using MySQL’s EXPLAIN and SHOW PROFILE commands, we identified slow-performing queries and analyzed their execution plans.
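As an illustration of this workflow, the sketch below uses SQLite's EXPLAIN QUERY PLAN as a self-contained stand-in for MySQL's EXPLAIN (the table and column names are hypothetical, not our actual schema): run the slow query through the planner, read the plan, fix the access path, and confirm the plan changed.

```python
import sqlite3

# Stand-in for the MySQL workflow: inspect the plan before and after an index.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE questions (id INTEGER PRIMARY KEY, exam TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO questions (exam, year) VALUES (?, ?)",
    [("JAMB", 2000 + i % 25) for i in range(1000)],
)

# Without an index, the planner falls back to a full table scan.
plan_before = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM questions WHERE exam = 'JAMB' AND year = 2010"
).fetchall()
print(plan_before)  # expect a SCAN over the whole table

conn.execute("CREATE INDEX idx_exam_year ON questions (exam, year)")
plan_after = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM questions WHERE exam = 'JAMB' AND year = 2010"
).fetchall()
print(plan_after)  # expect a SEARCH using idx_exam_year
```

In MySQL the same loop uses `EXPLAIN SELECT ...` (and `SHOW PROFILE` for per-stage timings), but the habit is identical: never tune a query without reading its plan first.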

2. Schema Review:

A detailed review of the database schema highlighted unnecessary indexes, redundant columns, and inefficient table designs.

3. Workload Analysis:

Tools like MySQL Workbench and monitoring utilities (such as Prometheus and Grafana) were used to analyze the workload patterns and identify bottlenecks.

4. Concurrency Testing:

Simulated concurrent user interactions helped uncover locking issues and inefficiencies in handling parallel requests.

The Solution

Armed with insights from our analysis, we implemented the following strategies to optimize the Class54 database:

1. Index Optimization

Indexes are vital for speeding up queries, but poorly designed indexes can be counterproductive. We:

  • Removed redundant and unused indexes.

  • Created composite indexes to optimize queries filtering on multiple columns.

  • Reordered columns in indexes to match query patterns, ensuring maximum efficiency.
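Column order matters because a composite index can only be used when the query filters on its leading column(s) (the leftmost-prefix rule). A minimal sketch of the effect, again with SQLite as a stand-in and a hypothetical attempts table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attempts (id INTEGER PRIMARY KEY, user_id INTEGER, exam TEXT, score INTEGER)"
)
# Composite index with user_id as the leading column.
conn.execute("CREATE INDEX idx_user_exam ON attempts (user_id, exam)")

def plan(sql):
    """Return the query plan as a single string for easy inspection."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Filter includes the leading column -> the composite index is used.
p1 = plan("SELECT score FROM attempts WHERE user_id = 7 AND exam = 'WAEC'")

# Filter only on the second column -> leftmost-prefix rule, full table scan.
p2 = plan("SELECT score FROM attempts WHERE exam = 'WAEC'")

print(p1)
print(p2)
```

This is why we reordered index columns to match the actual WHERE clauses our queries used, rather than the order the columns happened to appear in the table.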

2. Database Normalization and Denormalization

While normalization ensures data consistency, over-normalization can lead to excessive joins, slowing down queries. We:

  • Normalized tables to reduce redundancy where feasible.

  • Denormalized heavily accessed tables by introducing summary tables for frequently aggregated data, reducing join operations.
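A summary table like this is refreshed periodically so that dashboards read one precomputed row per exam instead of re-aggregating raw rows on every request. A hedged sketch of the pattern, with hypothetical answers and exam_stats tables (SQLite used here so the example is self-contained):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE answers (id INTEGER PRIMARY KEY, user_id INTEGER, exam TEXT, correct INTEGER);
-- Denormalized summary table: one row per exam, refreshed on a schedule.
CREATE TABLE exam_stats (exam TEXT PRIMARY KEY, attempts INTEGER, correct INTEGER);
""")
conn.executemany(
    "INSERT INTO answers (user_id, exam, correct) VALUES (?, ?, ?)",
    [(i % 50, "JAMB" if i % 2 else "WAEC", i % 3 == 0) for i in range(600)],
)

# Periodic refresh: one GROUP BY here replaces the same aggregation on
# every dashboard read.
conn.execute("""
INSERT OR REPLACE INTO exam_stats (exam, attempts, correct)
SELECT exam, COUNT(*), SUM(correct) FROM answers GROUP BY exam
""")

# Reads now hit the small precomputed table instead of the raw data.
stats = {
    row[0]: (row[1], row[2])
    for row in conn.execute("SELECT exam, attempts, correct FROM exam_stats")
}
print(stats)
```

The trade-off is staleness: the summary is only as fresh as its last refresh, which was acceptable for our analytics but would not be for transactional reads.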

3. Query Optimization

We rewrote inefficient queries and:

  • Replaced SELECT * with explicit column selection to minimize data transfer.

  • Used batch processing for bulk operations to avoid locking large datasets.

  • Introduced pagination for fetching records in chunks rather than loading everything at once.
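For the pagination point, keyset (seek-based) pagination scales better than growing OFFSET values, because each page resumes from the last seen key instead of skipping over all prior rows. A minimal sketch with a hypothetical records table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO records (payload) VALUES (?)", [(f"row-{i}",) for i in range(250)]
)

def fetch_in_pages(conn, page_size=100):
    """Keyset pagination: resume from the last seen id instead of OFFSET,
    so every page is an index range scan rather than a growing skip."""
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id, payload FROM records WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, page_size),
        ).fetchall()
        if not rows:
            break
        yield rows
        last_id = rows[-1][0]  # remember where this page ended

pages = list(fetch_in_pages(conn))
print([len(p) for p in pages])  # → [100, 100, 50]
```

The same `WHERE id > ? ... LIMIT ?` shape works unchanged in MySQL, provided the paging column is indexed (a primary key already is).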

4. Partitioning Large Tables

To manage large tables efficiently, we implemented horizontal partitioning:

  • Split large tables into smaller, more manageable partitions based on logical divisions (e.g., date ranges or user regions).

  • Adjusted queries to target specific partitions, significantly reducing scan times.
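MySQL handles this natively with PARTITION BY RANGE, so the application rarely sees the partitions at all. The sketch below mimics the idea at the application level (one table per five-year bucket, hypothetical names) purely to show why a partition-targeted query scans so much less data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def partition_for(year):
    # Bucket years into 5-year partitions, e.g. 2010 -> questions_p2010.
    return f"questions_p{year - year % 5}"

# One "partition" table per 5-year range of past questions (2000-2024).
for start in range(2000, 2025, 5):
    conn.execute(
        f"CREATE TABLE questions_p{start} (id INTEGER PRIMARY KEY, year INTEGER, body TEXT)"
    )

def insert(year, body):
    conn.execute(
        f"INSERT INTO {partition_for(year)} (year, body) VALUES (?, ?)", (year, body)
    )

for y in range(2000, 2025):
    insert(y, f"question from {y}")

# A query for 2010 touches only its own partition, not all 25 years of data.
rows = conn.execute(
    f"SELECT body FROM {partition_for(2010)} WHERE year = 2010"
).fetchall()
print(rows)  # → [('question from 2010',)]
```

With native MySQL partitioning, the optimizer performs this routing (partition pruning) automatically whenever the WHERE clause constrains the partitioning column.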

5. Caching Frequently Accessed Data

To reduce the load on the database:

  • Implemented an in-memory cache using Redis for frequently accessed, read-heavy data.

  • Used query caching for recurring analytical queries, significantly speeding up response times.
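Our production cache is Redis; the sketch below is a minimal in-process stand-in that shows the read-through pattern we used (compute on a cache miss, then serve from the cache within a time-to-live):

```python
import time

class TTLCache:
    """Minimal in-process stand-in for a Redis read-through cache:
    query results are cached for a fixed time-to-live."""

    def __init__(self, ttl_seconds=60):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (value, cached_at)

    def get_or_compute(self, key, compute):
        hit = self.store.get(key)
        if hit is not None and time.monotonic() - hit[1] < self.ttl:
            return hit[0]  # fresh cache hit: skip the database entirely
        value = compute()  # cache miss (or stale): hit the database
        self.store[key] = (value, time.monotonic())
        return value

calls = {"n": 0}

def expensive_query():
    """Hypothetical stand-in for a heavy analytical query."""
    calls["n"] += 1
    return [("JAMB", 2024)]

cache = TTLCache(ttl_seconds=60)
first = cache.get_or_compute("latest_exams", expensive_query)
second = cache.get_or_compute("latest_exams", expensive_query)
print(calls["n"])  # → 1 (second read served from the cache)
```

With Redis the mechanics are the same, just with `GET`/`SETEX` over the network; the key design decision is choosing a TTL short enough that stale reads are acceptable.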

6. Connection Pooling and Load Balancing

Efficient connection management was ensured by:

  • Implementing connection pooling to reuse database connections, reducing the overhead of frequently opening and closing connections.

  • Setting up read replicas for load balancing, directing read-heavy operations to replicas while keeping writes on the primary node.
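A toy illustration of the pooling idea follows; a real deployment would rely on the driver's or ORM's built-in pool (or a proxy such as ProxySQL) rather than hand-rolling one, but the principle is the same: connections are created once and checked in and out, instead of being opened and closed per request.

```python
import queue
import sqlite3

class ConnectionPool:
    """Toy fixed-size pool: connections are created up front and reused."""

    def __init__(self, size, factory):
        self._pool = queue.Queue()
        for _ in range(size):
            self._pool.put(factory())  # pay the connection cost once

    def acquire(self):
        return self._pool.get()  # blocks if all connections are in use

    def release(self, conn):
        self._pool.put(conn)

created = {"n": 0}

def make_conn():
    created["n"] += 1
    return sqlite3.connect(":memory:")

pool = ConnectionPool(size=2, factory=make_conn)
for _ in range(10):  # ten "requests" share just two connections
    conn = pool.acquire()
    conn.execute("SELECT 1")
    pool.release(conn)

print(created["n"])  # → 2
```

The pool size becomes a natural back-pressure knob: when it is exhausted, requests queue at the application tier instead of overwhelming the database.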

7. Migration to a Cloud-Native Database Architecture

To future-proof the system:

  • Migrated the database to a cloud-native managed service, taking advantage of features like auto-scaling and read replicas.

  • Introduced a data warehouse for heavy analytical workloads, separating transactional and analytical processing.

Results

After these optimizations, the Class54 database achieved the following improvements:

1. Reduced Query Latency:

Average query execution time dropped from 10-15 seconds to under 200 milliseconds for most queries.

2. Improved Scalability:

The database comfortably handled peak loads with over 500,000 records, and its architecture allowed seamless scaling as data volumes grew.

3. Lower CPU Usage:

Optimized queries and indexing significantly reduced CPU utilization, freeing up resources for other processes.

4. Enhanced User Experience:

Faster query responses and smoother application performance led to increased user satisfaction and retention.

5. Significant Cost Reduction:

Beyond the performance gains, these optimizations also translated into tangible cost savings. For instance, optimizing the student code generation function, which previously relied on inefficient parent::pluck('student_code') calls, drastically reduced the load on our DigitalOcean droplets. This single improvement, combined with the other database optimizations, allowed us to downsize our droplet configuration, substantially reducing our monthly infrastructure expenses.

Lessons Learned

1. Proactive Monitoring:

Regularly monitoring and profiling database performance can help identify issues before they escalate.

2. Iterative Optimization:

Database optimization is an ongoing process that evolves with application requirements.

3. Balancing Normalization and Denormalization:

Striking the right balance is crucial for maintaining data integrity and performance.

4. Investing in Automation:

Using automated tools for indexing suggestions and query optimization can save considerable time and effort.

Conclusion

Optimizing the Class54 database was a challenging but rewarding experience. By combining careful analysis, strategic indexing, query optimization, and leveraging cloud-native solutions, we transformed the database into a high-performing, scalable component of our platform. These lessons and approaches can be applied to other database systems facing similar challenges, ensuring efficient handling of large datasets without compromising performance.

Top comments (8)

Muhammed Bashua (khingz mobash):
This article has personally helped me understand some MySQL performance tricks, and I’ve started applying them in a project I’m currently working on. Great article!

Oluwajubelo:
That's good to hear. I'm glad this article is of great help to you!

Muhammed Bashua (khingz mobash):
Good work, sir.

Oluwajubelo:
Thanks, Muhammed.

Oluwatimilehin Samuel Alapakristi:
As your live application grows, visibility, monitoring, and DB optimizations should not be ignored. Nice one @oluwajubelo1

Oluwajubelo:
Yes, you're correct.

Grafrica:
This is very useful for us. We are currently going through something similar, and I find this article very helpful for optimizing our MySQL DB performance. Thank you so much to the author.

Oluwajubelo:
I'm glad you found this useful.