
Anh Nguyen

Stop One-By-One Inserts! Bulk Insert Data In Rails With insert_all Or activerecord-import ⚡

If your CSV import is slower than dial-up internet, discover how bulk inserts can rocket your Rails performance. 🐢

1. Why Bulk Inserts Matter

When dealing with large datasets (for example, importing a CSV containing more than a thousand tasks from your department), inserting records one by one with .create or .save can drastically slow down performance and cause database bottlenecks.

Imagine inserting 100,000 records—doing it traditionally means 100,000 separate queries! Instead, bulk inserting reduces this to a single efficient query, saving time and resources.

2. Prerequisites

Before diving in, ensure:

  • You're using Rails 6+ if you want the built-in insert_all method
  • You have the activerecord-import gem installed if you're on an older Rails version (the gem supports Rails 3.x and up) – see the Gemfile snippet below
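If you go the gem route, installation is the usual Gemfile addition followed by bundle install. A minimal sketch (no version constraint is required):

  # Gemfile
  gem "activerecord-import"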

3. Step-by-Step Guide

3.1 The Problem With One-By-One Inserts

For example, say you want to create 100k task records the naive way; this is what it looks like:

  # Create 100k task records, one query at a time
  100_000.times do |i|
    Task.create!({ name: "Task #{i}", description: "This is description of task #{i}", is_finished: false })
  end

I ran a quick benchmark with this snippet:

  execution_time = Benchmark.measure do
    100_000.times do |i|
      Task.create!({ name: "Task #{i}", description: "This is description of task #{i}", is_finished: false })
    end
  end

  puts "Execution time: #{execution_time.real} seconds"

Benchmark (one-by-one): creating 100,000 records this way takes about 20.99 seconds.

✅ Pros:

  1. Model validations and callbacks still run - if the name column is NOT NULL and a record is created with a nil name, the insert raises an error.
  2. Works on older Rails versions (< 6.0) - useful for legacy projects without built-in bulk insert support.

❌ Cons:

  1. Performance Issue - Each insert requires a separate database transaction, increasing overhead.
  2. Increased Network Traffic - 100k requests are made to the database, which increases network costs.

3.2 Solution 1: Use insert_all

The insert_all method allows you to insert multiple records at once, significantly boosting performance.

Example: Using insert_all to create 100k tasks

  tasks = []

  # Initialize 100k tasks hash then add to tasks array
  100_000.times do |i|
    tasks << { name: "Task #{i}", description: "This is description of task #{i}", is_finished: false }
  end

  # Bulk insert all the initialized tasks
  Task.insert_all(tasks)
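For a like-for-like comparison, the same Benchmark.measure wrapper can be put around the insert_all call. A minimal sketch mirroring the earlier benchmark (exact numbers depend on your hardware and database):

  execution_time = Benchmark.measure do
    # One bulk INSERT statement instead of 100,000 separate queries
    Task.insert_all(tasks)
  end

  puts "Execution time: #{execution_time.real} seconds"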

Benchmark (insert_all): the bulk insert completes in about 1.44 seconds.

✅ Pros:

  1. Faster than .create - Single SQL query
  2. Rails built-in method - Bulk operations directly in Rails without extra gems.

❌ Cons:

  1. Does not accept ActiveRecord models – Only works with raw hashes.
  2. Bypasses model validations and callbacks – Data integrity must be handled manually (see the sketch after this list).
  3. Cannot handle associations automatically – Requires extra queries to fetch related IDs.
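Since insert_all skips validations entirely, one option is to validate the attribute hashes manually before handing them over. A minimal sketch of that idea (the partition approach is just one way to do it, and it trades back some of the speed for safety):

  # Build the attribute hashes as before
  records = 100_000.times.map do |i|
    { name: "Task #{i}", description: "This is description of task #{i}", is_finished: false }
  end

  # Run Active Record validations in memory without saving anything
  valid, invalid = records.partition { |attrs| Task.new(attrs).valid? }

  Task.insert_all(valid) if valid.any?
  puts "Skipped #{invalid.size} invalid rows"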

Example: Inserting Tasks that Belong to a Department

  departments = [{ name: "Marketing" }]
  department = Department.insert_all(departments)
  department_id = department.id # raises an error: insert_all returns an ActiveRecord::Result, not a Department model

  tasks = [{ name: "Task 0", description: "This is the task description", is_finished: false, department_id: ??? }] # the department_id is still unknown here

Error screenshot: no id can be read from the return value of insert_all, so the call above raises an exception.

👉 Fix: Manually retrieve the department_id before inserting tasks, at the cost of an extra query.

  department = Department.find_by(name: "Marketing") # costs an extra SQL query
  tasks = [{ name: "Task 0", description: "This is the task description", is_finished: false, department_id: department.id }]
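Alternatively, on PostgreSQL (and newer SQLite versions) insert_all accepts a returning: option, so the generated ids can be captured without the extra lookup. A minimal sketch, assuming a PostgreSQL database:

  # RETURNING is handled by the database; the result rows come back as plain hashes
  result = Department.insert_all([{ name: "Marketing" }], returning: %w[id name])
  department_id = result.first["id"]

  tasks = [{ name: "Task 0", description: "This is the task description", is_finished: false, department_id: department_id }]
  Task.insert_all(tasks)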

3.3 Solution 2: Best of both worlds - use the activerecord-import gem

The import method from the activerecord-import gem lets you bulk insert model objects, including their associations.

Example: Using import to create 100k tasks in 1 department

  department = Department.new(name: "Marketing")
  tasks = []

  100_000.times do |i|
    tasks << Task.new({ name: "Task #{i}", description: "This is description of task #{i}", is_finished: false })
  end
  department.tasks = tasks

  execution_time = Benchmark.measure do
    # recursive: true imports the associated tasks along with the department (PostgreSQL only)
    Department.import [department], recursive: true
  end

  puts "Execution time: #{execution_time.real} seconds"
  • activerecord-import can import 1 department together with its 100k tasks in a single call
  • The performance is similar to insert_all (about 1.44 seconds)

Benchmark (activerecord-import): importing the department and its 100k tasks takes about 1.44 seconds.

✅ Pros:

  • Works with raw columns and arrays of values (fastest)
  • Works with model objects (faster)
  • Performs validations (fast) – see the options sketch after the cons list below
  • Performs on duplicate key updates (requires MySQL, SQLite 3.24.0+, or Postgres 9.5+)

❌ Cons:

  • Needs an extra gem installed
  • ActiveRecord callbacks related to creating, updating, or destroying records (other than before_validation and after_validation) will NOT be called by import. (You can run them separately with run_callbacks if needed.)
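Building on the validations point above, import exposes options such as validate: and batch_size: (both documented activerecord-import options; the batch size below is an arbitrary example):

  # Run Active Record validations and split the import into smaller INSERT statements
  Task.import tasks, validate: true, batch_size: 10_000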

4. Conclusion

One rule to remember when inserting a large number of records (for example, importing them from a CSV file) is to avoid creating records one by one. Instead, consider using insert_all or activerecord-import for much better performance.
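To tie it together, here is a minimal sketch of a CSV import using insert_all. It assumes a hypothetical tasks.csv with name and description headers, and batches the rows so no single query grows too large:

  require "csv"

  # tasks.csv is a hypothetical file with "name" and "description" columns
  rows = CSV.read("tasks.csv", headers: true).map do |row|
    { name: row["name"], description: row["description"], is_finished: false }
  end

  # Insert in batches so each statement stays a reasonable size
  rows.each_slice(10_000) { |batch| Task.insert_all(batch) }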

