If you do not know roadmap.sh, you should totally give it a look. it provides a high-level view of what you should learn if you want to increase your skills and knowledge in IT.
One of the knowledge you should have as a backend engineer is N+1 queries. It is the subject of this article without any surprise.
What it is how they can happen in a Graphql API in your Rails app ? How to prevent them and in particular in graphql and ruby ? We will try to see answer these in the following article. :)
Oh yes and if you like to learn or read about Rails, Ruby, databases and a lot of tech related stuff :
Keep in Touch
On Twitter : @yet_anotherDev
On Linkedin : Lucas Barret
N+1 Queries
Before going to the how
, let's understand the what
of N+1 queries.
This is not as complicated as it sounds:
Let's say you are developing a SaaS product; your customers are companies; these companies have users.
You have to write a query to get all the companies and their users.
With graphql-ruby, your code will look approximately like this:
app/graphql/types/user_type.rb
module Types
class UserType < Types::BaseObject
field :name, String, null: false
end
end
##app/graphql/types/company_type.rb
module Types
class CompanyType < Types::BaseObject
field :id, ID, null: false
field :name, String, null: false
field :user, [Types::UserType], null: true
def user
User.where(company_id: object.id)
end
end
end
##app/graphql/queries/company_query.rb
module Queries
class CompaniesQuery < BaseQuery
type [Types::CompanyType], null: true
def resolve(id)
Company.all
end
end
end
We have defined a Query that gets all the companies. This query returns an Array of Company, materialized by [Types::CompanyType]. In this CompanyType, we retrieve all the user materialized [Types::UserType].
To test it, you can run a Rspec test like this :
Company Load (0.6ms) SELECT "companies.*" FROM "companies"
↳ app/controllers/graphql_controller.rb:15:in `execute'
User Load (0.4ms) SELECT "users".* FROM "users" WHERE "users"."company_id" = $1 [["company_id", 1]]
↳ app/controllers/graphql_controller.rb:15:in `execute'
User Load (0.3ms) SELECT "users".* FROM "users" WHERE "users"."company_id" = $1 [["company_id", 2]]
↳ app/controllers/graphql_controller.rb:15:in `execute'
Completed 200 OK in 49ms (Views: 0.4ms | ActiveRecord: 19.1ms | Allocations: 8978)
What should trigger your attention is that Companies are queried once, then Users are queried twice. You are doing 2 SQL queries to load the two companies' users.
If you had three users, you would trigger 3 SQL queries for the company' users. And so on for each of your company, 10 companies with 10 users make 100 queries !
We call these N+1 queries, which can happen in graphql and with a REST API.
We can easily understand that this query needs to be more scalable, because it could put too much pressure on our database.
Batch Loading with GraphQL-Batch gem
The graphql-batch
gem will enable you to solve this issue even more. First, avoid N+1 by batching and grouping all the same entity's queries into one SQL query. Two allows you to do lazy-loading on your fields, so they will not be queried if not used in a particular query.
In the next part of this article, we will see how to create a Loader for our data to avoid this issue.
It will be specialized to make it easier to understand, but you can create one generic to use with all your model.
So as we said before, we want to gather the list of users from companies.
Let's use something similar to the example in the graphql-ruby repository. If I copy and paste the example of the lib, I will end up with something like this after adapting a bit.
class RecordLoader < GraphQL::Batch::Loader
def initialize(model)
@model = model
end
def perform(ids)
@model.where(company_id: ids).each { |record| fulfill(record.company_id, record) }
ids.each { |id| fulfill(id, nil) unless fulfilled?(id) }
end
end
This will not work; what will happen differs from what you expect. You will end up with an error!
This is due to the way we fulfill the promise. It is associating an id with the record in a local cache.
@model.where(company_id: ids).each { |record| fulfill(record.company_id, record) }
In other words: we have associated the first user to company_id 1, the second to company_id 2, and so on. To counter that, you must group_by
and write something like this.
class ArrayRecordLoader < GraphQL::Batch::Loader
def initialize(model)
@model = model
end
def perform(ids)
@model.where(company_id: ids).group_by(&:company_id).each { |key,record| fulfill(key, record) }
ids.each { |id| fulfill(id, nil) unless fulfilled?(id) }
end
end
This works very well; executing it will give you what you expect.
You have a promise per id key you give. In this case, it is the company_id, and you cannot fulfill a promise that has been fulfilled. So, in this case, with the following code :
@model.where(company_id: ids).each { |record| fulfill(record.company_id, record) }
You will fulfill the promise for the first company_id with the first User retrieved. And this is not an array, so it will not work, so the expected type will not be respected, and graphql will tell you that it is not the suitable type.
Conclusion
Through this article, we have seen N+1 queries and that they can happen through any API, GraphQL, or REST SOAP when calling your database.
We have seen how to use graphql-batch and implement a basic Loader. To avoid N+1 and using Lazy loading to put less pressure onto your database is something you want to know.
Top comments (0)