Handling large datasets in Node.js with MySQL can be challenging due to memory constraints and performance bottlenecks. Processing 1 million records efficiently requires optimizing queries, using streaming, and ensuring proper indexing. In this article, we'll go through best practices and code examples for handling large datasets efficiently.
Challenges in Processing Large Data in Node.js
- Memory Consumption -- Fetching all records at once can overload memory.
- Query Performance -- Large dataset queries can slow down if not optimized.
- Concurrency & Bottlenecks -- Processing everything in one long synchronous pass can block the event loop, so work has to be split into batches.
Solutions to Process 1 Million Records
1. Use Pagination or Batching
Instead of retrieving all records at once, process them in smaller chunks.
Example: Fetching Data in Batches
const mysql = require('mysql2/promise');

async function processLargeDataset() {
  const connection = await mysql.createConnection({
    host: 'localhost',
    user: 'root',
    password: 'password',
    database: 'test_db'
  });

  const batchSize = 10000; // Process 10K records at a time
  let offset = 0;
  let rows;

  do {
    [rows] = await connection.execute(
      `SELECT * FROM large_table ORDER BY id LIMIT ?, ?`,
      [offset, batchSize]
    );

    if (rows.length) {
      console.log(`Processing ${rows.length} records...`);
      await processData(rows);
    }

    offset += batchSize;
  } while (rows.length > 0);

  await connection.end();
}

async function processData(records) {
  for (const record of records) {
    // Perform operations like transformation, writing to another table, etc.
  }
}

processLargeDataset();
Why is this effective?
- Uses LIMIT ?, ? (offset and batch size) to fetch records in chunks, matching the query above.
- Prevents memory overload by processing a limited set of records at a time.
- Note: OFFSET still scans and discards the skipped rows, so it gets slower as the offset grows; a keyset-based alternative is sketched below.
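A related option, not shown in the original example: when the table has a numeric primary key, keyset (seek) pagination avoids the cost of large offsets by resuming from the last id seen instead of skipping rows. This is only a sketch, assuming the same large_table with an id primary key and reusing processData() from above; it uses connection.query(), which interpolates the parameters client-side.

const mysql = require('mysql2/promise');

async function processWithKeyset() {
  const connection = await mysql.createConnection({
    host: 'localhost',
    user: 'root',
    password: 'password',
    database: 'test_db'
  });

  const batchSize = 10000;
  let lastId = 0; // assumes id values start above 0
  let rows;

  do {
    // Resume from the last id seen instead of skipping rows with OFFSET
    [rows] = await connection.query(
      `SELECT * FROM large_table WHERE id > ? ORDER BY id LIMIT ?`,
      [lastId, batchSize]
    );

    if (rows.length) {
      await processData(rows); // same helper as in the batching example
      lastId = rows[rows.length - 1].id;
    }
  } while (rows.length > 0);

  await connection.end();
}

processWithKeyset().catch(console.error);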
2. Use MySQL Streaming for Large Data
Instead of loading everything into memory, use the mysql2 driver's streaming support to handle rows one at a time.
Example: Using MySQL Streams
const mysql = require('mysql2');

const connection = mysql.createConnection({
  host: 'localhost',
  user: 'root',
  password: 'password',
  database: 'test_db'
});

const query = connection.query('SELECT * FROM large_table');

query
  .stream()
  .on('data', (row) => {
    console.log('Processing row:', row);
    // Perform processing on each row
  })
  .on('end', () => {
    console.log('All rows processed.');
    connection.end();
  });
Why is this better?
- Uses streaming, so only a few records are kept in memory at a time.
- Reads the entire result set in a single pass, avoiding the repeated queries of batch processing.
- If each row needs asynchronous work, add backpressure so rows are not emitted faster than they are handled (see the sketch below).
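One caveat with the 'data' handler above: it does not wait for asynchronous work, so rows can pile up in memory if processing is slower than the database. A minimal sketch of one way to add backpressure, assuming a hypothetical async processRow() handler of your own, is to consume the stream with for await...of, which pauses the stream while each row is handled:

const mysql = require('mysql2');

const connection = mysql.createConnection({
  host: 'localhost',
  user: 'root',
  password: 'password',
  database: 'test_db'
});

// Hypothetical per-row handler; replace with your own logic.
async function processRow(row) {
  // e.g. transform the row or write it to another table
}

async function streamWithBackpressure() {
  const stream = connection.query('SELECT * FROM large_table').stream();

  // for await...of reads one row at a time and pauses the stream
  // while processRow() is still running, keeping memory usage flat.
  for await (const row of stream) {
    await processRow(row);
  }

  console.log('All rows processed.');
  connection.end();
}

streamWithBackpressure().catch(console.error);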
3. Optimize MySQL Queries
If the dataset is too large, make sure queries are optimized:
- Use Indexing: Ensure that columns used in WHERE, ORDER BY, and JOIN clauses are indexed (a quick check is sketched after this list).
- Avoid SELECT *: Fetch only the required columns to reduce memory usage.
- Use MySQL Partitioning: If applicable, partition large tables for better performance.
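As a rough illustration of the first two points (the created_at and name columns and the index name are made up for this example, not taken from the article), you can add an index on the filtered column, select only the columns you need, and inspect the plan with EXPLAIN:

const mysql = require('mysql2/promise');

async function checkQueryPlan() {
  const connection = await mysql.createConnection({
    host: 'localhost',
    user: 'root',
    password: 'password',
    database: 'test_db'
  });

  // Hypothetical index on a column used in WHERE/ORDER BY (run once per schema).
  try {
    await connection.query(
      'CREATE INDEX idx_large_table_created_at ON large_table (created_at)'
    );
  } catch (err) {
    // ER_DUP_KEYNAME just means the index already exists.
    if (err.code !== 'ER_DUP_KEYNAME') throw err;
  }

  // Fetch only the columns you actually need instead of SELECT *.
  const [plan] = await connection.query(
    `EXPLAIN SELECT id, name FROM large_table
     WHERE created_at >= ? ORDER BY created_at LIMIT 1000`,
    ['2024-01-01']
  );
  console.log(plan); // the "key" column should show the index being used

  await connection.end();
}

checkQueryPlan().catch(console.error);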
4. Bulk Insert for Faster Processing
If the goal is to transfer or update large datasets, use bulk inserts instead of inserting records one by one.
Example: Bulk Insert
const mysql = require('mysql2/promise');

async function bulkInsert(records) {
  const connection = await mysql.createConnection({
    host: 'localhost',
    user: 'root',
    password: 'password',
    database: 'test_db'
  });

  // Build a nested array of row values: [[id, name, value], ...]
  const values = records.map(record => [record.id, record.name, record.value]);

  // VALUES ? expands the nested array into a multi-row insert in a single query
  await connection.query(
    `INSERT INTO new_table (id, name, value) VALUES ?`,
    [values]
  );

  await connection.end();
}
Why is this better?
- A single query inserts multiple records, reducing query overhead.
- Improves performance when handling large data migrations.
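One practical caveat: a single multi-row INSERT for a very large array can exceed MySQL's max_allowed_packet. A simple workaround, sketched here with a made-up chunkSize, is to split the records and bulk insert chunk by chunk:

// Split a large array of records into several bulk inserts.
// chunkSize is an assumption to tune, not a value from the article.
async function bulkInsertInChunks(records, chunkSize = 5000) {
  for (let i = 0; i < records.length; i += chunkSize) {
    const chunk = records.slice(i, i + chunkSize);
    await bulkInsert(chunk); // reuses bulkInsert() from the example above
    console.log(`Inserted ${Math.min(i + chunkSize, records.length)} of ${records.length} records`);
  }
}

In a real migration you would reuse a single connection or a pool inside the loop rather than letting bulkInsert() reconnect for every chunk.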
Conclusion
Processing 1 million records in Node.js with MySQL requires batch processing, streaming, query optimization, and bulk operations. Using the right approach ensures better performance and prevents memory crashes.
Key Takeaways:
- Use batch processing (LIMIT/OFFSET or keyset pagination) to handle records in chunks.
- Use MySQL streaming to avoid loading all records into memory.
- Optimize queries with indexes and selective column fetching.
- Use bulk inserts to speed up data migration or updates.
By following these best practices, you can efficiently handle large datasets in Node.js without running into memory issues or slow query performance.