I recently heard about squashing Rails migrations. I love learning how things work internally and I know little about Rails migrations, so I wanted to experiment with it. This is the time and place to learn about migrations and try to squash some!
Migration in general
It might seem silly, but before anything else, let's recap what migrations are and why they are helpful.
If you look at the Rails documentation:
migrations are a convenient way to alter your database schema over time in a consistent manner.
You can think of each migration as being a new 'version' of the database.
Moreover, they are most often written in code, which enables reviews and, as we said, versioning of database changes.
Rails migration internals
There are several essential things to understand when you are running a migration.
First, in your database there is a table maintained and used by Rails, which has nothing to do with your models. This table is schema_migrations; there is not much in it except the versions of the migrations you have run.
You can access this table through a model in your Rails app. For example, go to the models folder and create a schema_migration.rb file:
class SchemaMigration < ActiveRecord::Base
end
An implementation already exists; you can see its documentation here. But you won't be able to access it :)
Now you can use it like any other model:
irb(main):001:0> SchemaMigration.all
SchemaMigration Load (0.6ms) SELECT "schema_migrations".* FROM "schema_migrations"
=> []
Nothing in here, which is expected for now: we have not run any migration yet. Let's create one and see.
> rails g model user name:string
invoke active_record
create db/migrate/20230723165832_create_users.rb
create app/models/user.rb
The generated migration creates a users table with a name column and timestamps:
class CreateUsers < ActiveRecord::Migration[7.0]
  def change
    create_table :users do |t|
      t.string :name
      t.timestamps
    end
  end
end
If you do not run your migration, the table will still be empty:
irb(main):001:0> SchemaMigration.all
SchemaMigration Load (1.0ms) SELECT "schema_migrations".* FROM "schema_migrations"
=> []
But once you run it and check your schema_migrations table, you will find the version of your migration:
> rails db:migrate
== 20230723165832 CreateUsers: migrating ==================================
-- create_table(:users)
-> 0.0137s
== 20230723165832 CreateUsers: migrated (0.0137s) =========================
> bin/rails c
irb(main):001:0> SchemaMigration.all
SchemaMigration Load (0.5ms) SELECT "schema_migrations".* FROM "schema_migrations"
=> [#<SchemaMigration:0x0000000112c4dbd8 version: "20230723165832">]
Now that we understand that, squashing our migrations is easy.
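To make the bookkeeping above concrete, here is a minimal pure-Ruby sketch of how pending migrations can be worked out by comparing file names with recorded versions. It is a simplified model, not Rails' actual implementation: the method name pending_versions and the plain arrays standing in for db/migrate and the schema_migrations table are assumptions made for illustration.

```ruby
require "set"

# Simplified model, NOT Rails' own code.
# migration_files   stands in for the file names under db/migrate.
# recorded_versions stands in for the rows of the schema_migrations table.
def pending_versions(migration_files, recorded_versions)
  recorded = recorded_versions.to_set
  migration_files
    .map { |file| File.basename(file)[/\A\d+/] } # leading digits are the version
    .compact                                     # ignore files without a version prefix
    .reject { |version| recorded.include?(version) }
    .sort
end

files = [
  "db/migrate/20230723165832_create_users.rb",
  "db/migrate/20230724134526_create_companies.rb"
]

# Nothing recorded yet: every migration is pending.
p pending_versions(files, [])
# The first version is recorded: only the second one is left to run.
p pending_versions(files, ["20230723165832"])
```

The point of the sketch is that only the version string matters: a migration whose version is already recorded is skipped, whatever the file contains.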
As I said Padle is nice squash is better
What is squashing? Squashing is the action of merging all your migrations into a single file.
Why would you do that? Because running migrations means loading every migration file, and that can take ages if you have a lot of them.
But first, let's create another migration:
> rails g model company name:string
invoke active_record
create db/migrate/20230724134526_create_companies.rb
create app/models/company.rb
This creates the following file:
class CreateCompanies < ActiveRecord::Migration[7.0]
  def change
    create_table :companies do |t|
      t.string :name
      t.timestamps
    end
  end
end
After running your migration, check your schema_migrations table: you will see a new SchemaMigration object, which is super cool!
irb(main):001:0> SchemaMigration.all
SchemaMigration Load (0.5ms) SELECT "schema_migrations".* FROM "schema_migrations"
=> [#<SchemaMigration:0x0000000112c4dbd8 version: "20230723165832">,#<SchemaMigration:0x0000000az2c4efd8 version: "20230724134526">]
Now with our two migrations, we can already squash them. And it is way easier than you think!
After running your migrations, you end up with either a schema.rb or a structure.sql file, depending on the schema format you chose.
Take the table definitions from it and paste them into the change method of your last migration, in our case:
db/migrate/20230724134526_create_companies.rb
You can rename it if you like, for example:
db/migrate/20230724134526_squash_table.rb
class SquashTable < ActiveRecord::Migration[7.0]
  def change
    create_table "table1", force: :cascade do |t|
      t.string "name"
    end
    create_table "table2", force: :cascade do |t|
      t.string "name"
    end
    create_table "table3", force: :cascade do |t|
      t.string "name"
      t.datetime "created_at", null: false
      t.datetime "updated_at", null: false
    end
  end
end
Then you can delete the first migration and rerun the migration!
Nothing happens, right? That is expected: the schema_migrations table already records the version of this migration, and the version does not change when we rename the file, so it will not be rerun. Unless you drop your database and run your migrations again, like this:
> rails db:drop db:create db:migrate
== 20230723165832 CreatePalourdes: migrating ==================================
-- create_table("table1", {:force=>:cascade})
-> 0.0031s
-- create_table("table2", {:force=>:cascade})
-> 0.0018s
-- create_table("table3", {:force=>:cascade})
-> 0.0017s
== 20230723165832 CreateTables: migrated (0.0066s) =========================
This will run as before, except that Rails no longer has to load thousands of migrations, so it will run much faster locally and in your CI.
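To see why renaming the file did not trigger a rerun earlier, here is a small sketch that splits a migration file name into its version and its name. The helper parse_migration_filename is hypothetical (it is not a Rails API), and it assumes the usual "<timestamp>_<snake_case_name>.rb" naming convention.

```ruby
# Hypothetical helper, not part of Rails: split a migration file name
# into the version prefix and the descriptive name.
def parse_migration_filename(path)
  match = File.basename(path).match(/\A(\d+)_([a-z0-9_]+)\.rb\z/)
  return nil unless match

  { version: match[1], name: match[2] }
end

before = parse_migration_filename("db/migrate/20230724134526_create_companies.rb")
after  = parse_migration_filename("db/migrate/20230724134526_squash_table.rb")

puts before[:version] == after[:version] # the rename keeps the same version
puts before[:name] == after[:name]       # only the descriptive name differs
```

Since the version is what gets stored in schema_migrations, a database that has already run 20230724134526 treats the renamed file as done.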
[Edit] If you are doing this on a production database, a lot of things have to be taken into account. But overall, you have to delete the schema_migrations entries for migrations that do not exist anymore :).
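As a rough illustration of that cleanup, the sketch below works out which recorded versions no longer have a file on disk and builds the matching DELETE statement as a string. Both method names are hypothetical, the arrays stand in for the real table and the db/migrate folder, and nothing here touches a database: treat it as a starting point to review, not a production procedure.

```ruby
# Sketch only: which recorded versions have no migration file anymore?
# recorded_versions stands in for the rows of schema_migrations,
# migration_files for the file names currently under db/migrate.
def stale_versions(recorded_versions, migration_files)
  on_disk = migration_files.map { |file| File.basename(file)[/\A\d+/] }.compact
  recorded_versions - on_disk
end

# Build (but do not execute) the SQL that would remove those rows.
def delete_statement(versions)
  return nil if versions.empty?

  # Versions must be digits only, so the string is safe to interpolate.
  versions.each { |v| raise ArgumentError, "bad version: #{v}" unless v.match?(/\A\d+\z/) }
  list = versions.map { |v| "'#{v}'" }.join(", ")
  "DELETE FROM schema_migrations WHERE version IN (#{list})"
end

recorded = ["20230723165832", "20230724134526"]
files    = ["db/migrate/20230724134526_squash_table.rb"]

stale = stale_versions(recorded, files)
p stale
puts delete_statement(stale)
```

In this example only the version of the deleted first migration is stale; the squashed migration keeps its row, so it is never rerun on the existing database.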
Conclusion
As you have seen, Rails migrations and squashing them are not so frightening.
In this article, we gained a better understanding of Rails migrations and of how to squash them to improve performance.
I am sure you will agree that understanding Rails internals is thrilling. See you in the next article. :)
If you have any questions or tips, please do not hesitate to leave a comment :).
PS: Yes, there is a typo in this article: I meant Padel. We are all human after all :P.
Top comments (3)
TL;DR: don't do it!
By squashing the migrations you miss out on all the benefits of migrations, and it's redundant and unnecessary.
When migrating a database with
rails also runs db:schema:dump that creates the schema.rb (don't use schema.sql).
When setting up a new instance of your app, you run
that creates the database and runs only the one migration "schema.rb". So what you want to achieve is already built into Rails.
The benefits of incremental migration are
development in branches
Rolling back migrations is essential when you develop with feature branches.
Think of developing in a feature branch with a new migration.
The db schema of your branch differs from trunk (main/master).
Now you have to fix something in your trunk (or merge a different feature branch or do something in another branch, ...).
Then you have to roll back the migrations of your current branch, switch branches, and run the migrations of the new branch.
multiple instances with different versions
I normally have several instances of my projects. At least three stages (development / staging / production), sometimes multiple instances (customers). Each instance may be on a different release.
To upgrade an instance, all migrations of the target release / commit that have not been run yet have to be run, but only these.
Migrations that have already been run must not be run again (they would produce errors).
rails handles all this for you with db:migrate.
By migrating, you preserve existing data; it's not a recreation of the database.
conclusion
Thanks for your comment! This really is an edge case indeed, and you should be really careful when you do it.
The purpose of this article was to better understand migrations. :P
So in your opinion, you can use db:setup to avoid long seeding when running E2E tests in your CI, without squashing migrations?
There are many good reasons to squash migrations.
For separating data migrations, this gem is fantastic:
github.com/ilyakatz/data-migrate