At Tymate, the question of using GraphQL for our APIs has been around for a long time. Since most of our desktop applications are built with React, it was obviously a recurring request from our front-end developers.
My name is Julien, I'm 32, and I've been a Ruby developer for a little over a year, after a career change. This article is a loose transcript of an internal workshop at Tymate where I presented my feedback on a first use of the graphql-ruby gem.
The following assumes basic knowledge of how GraphQL works.
Data structure
The first thing to know when you start working with GraphQL is that everything is typed. Every piece of data you use or want to serialize must be clearly defined. It's not something you're naturally used to with Ruby, but it quickly becomes very comfortable.
Types
So we will naturally start by defining types.
First, we will create types for the models of our API that we will need for our GraphQL queries. The definition of GraphQL types is very similar to the logic found in serializers on our classic REST APIs: the definition of the data that we will be able to consult and the form in which this data is made available. But this obviously goes further by integrating typing, validations, possibly rights management...
Below is an example of a GraphQL type corresponding to a model of our API:
module Types
  class ProviderType < Types::BaseObject
    # These lines integrate the specifics of Relay into the type;
    # we'll come back to this later in the article.
    implements GraphQL::Relay::Node.interface
    global_id_field :id

    # Here, we define the fields that can be requested on our object:
    # basic attributes, methods from the model,
    # or methods defined directly in the type.
    # For each field, we must specify its type
    # and whether it can be null.
    field :display_name, String, null: true
    field :phone_number, String, null: true
    field :siret, String, null: true
    field :email, String, null: true
    field :iban, String, null: true
    field :bic, String, null: true
    field :address, Types::AddressType, null: true
    field :created_at, GraphQL::Types::ISO8601DateTime, null: true
    field :updated_at, GraphQL::Types::ISO8601DateTime, null: true

    # Here, we choose to override the model's address relationship.
    # Simply defining the address field is enough to return
    # the correct items, but by redefining it we can integrate
    # the BatchLoader gem to eradicate N+1 queries from our request.
    def address
      # Batch by provider id, then hand each id its matching address.
      BatchLoader::GraphQL.for(object.id).batch do |provider_ids, loader|
        Address.where(
          addressable_type: 'Provider', addressable_id: provider_ids
        ).each { |address| loader.call(address.addressable_id, address) }
      end
    end
  end
end
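The batching idea behind this pattern can be sketched in plain Ruby, independently of the gem (TinyBatch is a hypothetical class, not the gem's real API): queue all the keys first, resolve them with a single lookup, then distribute the results.

```ruby
# Minimal illustration of the batching idea behind BatchLoader (a sketch):
# queue keys, resolve them all with one lookup, then distribute results.
class TinyBatch
  def initialize(&resolver)
    @resolver = resolver
    @keys = []
  end

  # Returns a lazy handle; nothing is loaded until it is called.
  def for(key)
    @keys << key
    -> { results[key] }
  end

  private

  def results
    # One lookup for all queued keys, memoized for the whole batch.
    @results ||= @resolver.call(@keys)
  end
end

addresses = { 1 => 'Lille', 2 => 'Paris' }
batch = TinyBatch.new { |ids| ids.to_h { |id| [id, addresses[id]] } }
lazy_a = batch.for(1)
lazy_b = batch.for(2)
lazy_a.call # => "Lille" (both keys were resolved in a single pass)
```

The real gem adds promise-like evaluation on top of this, but the principle is the same: one query for N parents instead of N queries.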
Queries
Once we have defined the types we need, they will appear in the documentation automatically generated by GraphiQL, but they cannot yet be accessed by API consumers.
For that, we need to create queries. A query is the equivalent of a GET request in REST: the consumer asks for either a specific object or a collection of objects (which the Relay methodology calls a connection; I will come back to this later).
Display an isolated object
Queries are defined in the QueryType class:
module Types
  class QueryType < Types::BaseObject
    # Here, the field is the name of the query.
    # The arguments passed in the block will allow us
    # to find the object we're looking for.
    field :item, Types::ItemType, null: false do
      argument :id, ID, required: true
    end

    # Unlike the model types seen above,
    # for queries the field doesn't stand on its own:
    # we must define, in a method of the same name,
    # the logic that will resolve the query.
    def item(id:)
      Item.find(id)
    end
  end
end
In its behavior, a query is very close to a REST controller action: we define the query in a field, associate it with the type of the object(s) it returns, and declare the arguments that will let us retrieve the desired object, for example an ID.
In the item method, we receive the arguments passed to the field and look up the corresponding object.
Lighten Queries
As soon as you have many different models that you want to expose through a query, you will face a lot of code duplication. A bit of metaprogramming can lighten our query_type.rb:
module Types
  class QueryType < Types::BaseObject
    # First we define a common method that finds an object
    # from a unique ID provided by Relay.
    def item(id:)
      ApiSchema.object_from_id(id, context)
    end

    # Then we create the fields that call the above method
    # for all types for which we need a query.
    %i[
      something
      something_else
      attachment
      user
      identity
      drink
      food
      provider
    ].each do |method|
      field method,
            "Types::#{method.to_s.camelize}Type".constantize,
            null: false do
        argument :id, ID, required: true
      end
      alias_method method, :item
    end
  end
end
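The aliasing trick at the heart of this snippet can be checked in plain Ruby (Finder is a hypothetical class for illustration, not part of the API): one shared implementation exposed under several names.

```ruby
# Plain-Ruby illustration of the alias_method trick:
# one shared implementation, exposed under several names,
# just as the QueryType aliases #item once per field.
class Finder
  def item(id:)
    "found item #{id}"
  end

  # One alias per exposed name.
  %i[drink food provider].each do |method|
    alias_method method, :item
  end
end

Finder.new.drink(id: 1)    # => "found item 1"
Finder.new.provider(id: 2) # => "found item 2"
```

Each aliased method still receives its own field's arguments, so the shared `item` resolver works unchanged for every type.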
Display a collection of objects
By default, there is not much difference between a query that returns a single object and one that returns a collection. We simply have to specify that the query result will be an array, by wrapping the type in []:
module Types
  class QueryType < Types::BaseObject
    field :items, [Types::ItemType], null: false do
      # we no longer search for an item by its ID, but by its container.
      argument :category_id, ID, required: true
    end

    def items(category_id:)
      Category.find(category_id).items
    end
  end
end
We call the query like this:
query items {
  items(categoryId: "1") {
    id
    displayName
  }
}
And we get the following JSON in response:
[
  {
    "id": "1",
    "displayName": "toto"
  },
  {
    "id": "2",
    "displayName": "tata"
  }
]
Relay & connections
This works for very basic needs, but it quickly shows its limits.
For better structured requests, Facebook created Relay, a GraphQL overlay (natively supported by the gem) that introduces two very practical paradigms:
- global IDs: strings created from a Base64 encoding of "ModelName-ID"
- a very specific nomenclature to organize and consume collections of objects: the connections
edit: I wrote the initial workshop on the basis of version v1.9 of the graphql-ruby gem. The notion of connection has been extracted from Relay to become the default formatting of the collections in v1.10.
Using global IDs instead of classic IDs primarily benefits the JS applications consuming the API: it lets them always use the ID as a key when iterating over objects. From the API's point of view, working with IDs that are unique regardless of the object's type is also very practical.
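A sketch of what these global IDs look like under the hood: graphql-ruby's GraphQL::Schema::UniqueWithinType (which the doc's filter code uses to decode them) Base64-encodes a "TypeName-id" string; the "-" separator shown here is the gem's default and matches the IDs visible in the sample responses.

```ruby
require 'base64'

# How a Relay-style global ID is built: Base64 over "TypeName-id".
global_id = Base64.strict_encode64('Place-352')
# => "UGxhY2UtMzUy" (the ID seen in the sample responses)

# Decoding goes the other way: split the Base64 payload on the separator.
type_name, db_id = Base64.decode64(global_id).split('-', 2)
# type_name == "Place", db_id == "352"
```

This is why the same ID can be used as a unique key across all object types: the type name is baked into the ID itself.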
Nomenclature
Concerning connections, here is what our previous query looks like adapted to this format:
{
  items(first: 2) {
    totalCount
    pageInfo {
      hasNextPage
      hasPreviousPage
      endCursor
      startCursor
    }
    edges {
      cursor
      node {
        id
        displayName
      }
    }
  }
}
And the corresponding response:
{
  "data": {
    "items": {
      "totalCount": 351,
      "pageInfo": {
        "hasNextPage": true,
        "hasPreviousPage": false,
        "endCursor": "Mw",
        "startCursor": "MQ"
      },
      "edges": [
        {
          "cursor": "MQ",
          "node": {
            "id": "UGxhY2UtMzUy",
            "displayName": "Redbeard"
          }
        },
        {
          "cursor": "Mg",
          "node": {
            "id": "UGxhY2UtMzUx",
            "displayName": "Frey of Riverrun"
          }
        },
        {
          "cursor": "Mw",
          "node": {
            "id": "QmlsbGVyLTI=",
            "displayName": "Something Else"
          }
        }
      ]
    }
  }
}
The connection gives us access to additional information about our collection. By default, the gem generates the pageInfo used for cursor pagination, but we can also write custom fields, like the totalCount added here to support a more traditional numbered pagination.
The edges are intended to contain information about the relationship between an object and its collection. By default, an edge contains the cursor, which represents the position of its node within the collection, but it is possible to define custom fields as well.
The nodes are simply the objects of the collection.
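The cursors themselves are also opaque Base64 strings. In the response above they simply encode the node's position in the collection, a sketch of which (assuming plain Base64 with the "=" padding stripped, which matches the sample values) looks like this:

```ruby
require 'base64'

# Cursors like "MQ", "Mg", "Mw" are Base64-encoded positions
# ("1", "2", "3") with the trailing '=' padding stripped.
cursor = Base64.strict_encode64('3').delete('=')
# => "Mw" (the endCursor of the sample response)
```

Clients should treat cursors as opaque tokens anyway: only the server needs to know how they are built.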
Integration
Relay brings valuable data to queries, but it makes their definition more verbose. Concretely, instead of a single query, we have to define three types:
- the query
- the connection
- the edge
ConnectionType
The connection type definition includes three components:
- the specification of the EdgeType to be used
- the parameters that can be applied to this connection
- the custom fields that can be requested in return
class Types::Connections::ProvidersConnectionType < Types::BaseConnection
  # We start by declaring the EdgeType class we want to associate
  # with the connection.
  edge_type(Types::Edges::ProviderEdge)

  # By defining input types here, we can then call them in the queries
  # associated with this connection.
  # I will come back to the `eq/ne/in` arguments below.
  class Types::ProviderApprovedInput < ::Types::BaseInputObject
    argument :eq, Boolean, required: false
    argument :ne, Boolean, required: false
    argument :in, [Boolean], required: false
  end

  class Types::ProviderFilterInput < ::Types::BaseInputObject
    argument :approved, Types::ProviderApprovedInput, required: false
  end

  # Missing by default, we can incorporate a count of the objects
  # in the collection by creating a field and resolving it
  # in the ConnectionType.
  field :total_count, Integer, null: false

  def total_count
    skipped = object.arguments[:skip] || 0
    object.nodes.size + skipped
  end
end
EdgeType
The edge can be thought of as a join table, acting as a bridge between the connection and the nodes it contains. By default, we only need to set the node_type to identify the type of object returned by our connection, but it is also possible to define custom methods. I haven't yet encountered a use case for this feature, however.
class Types::Edges::ProviderEdge < GraphQL::Types::Relay::BaseEdge
  node_type(Types::ProviderType)
end
QueryType
Lastly, once the connection is defined, it must be called:
- either in a specific query
- or in the type of a parent object
To do this, instead of associating the field with the type of the final returned object, we associate it with the type of the connection.
module Types
  class QueryType < Types::BaseObject
    field :providers, Types::Connections::ProvidersConnectionType, null: true

    def providers
      Provider.all
    end
  end
end

module Types
  class ParentType < Types::BaseObject
    # ...
    global_id_field :id
    field :display_name, String, null: true
    # ...
    field :providers, Types::Connections::ProvidersConnectionType, null: true

    def providers
      # Batch by parent id so sibling fields don't trigger N+1 queries.
      BatchLoader::GraphQL.for(object.id).batch(default_value: []) do |ids, loader|
        Provider.where(parent_id: ids).each do |provider|
          loader.call(provider.parent_id) { |batch| batch << provider }
        end
      end
    end
  end
end
By default, you can pass all the basic paging arguments to the query (first, after, before, last...). If necessary, we can add extra arguments to refine our query:
module Types
  class QueryType < Types::BaseObject
    field :providers, Types::Connections::ProvidersConnectionType, null: true do
      # filter calls the InputTypes specific to ProvidersConnectionType
      # which we defined above
      argument :filter, Types::ProviderFilterInput, required: false
    end

    def providers
      Provider.all
    end
  end
end
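On the consumer side, such a filtered query might look like this (a sketch: the argument and field names follow the input types defined earlier, with approved as the example filter from the connection type):

```graphql
{
  providers(first: 10, filter: { approved: { eq: true } }) {
    totalCount
    edges {
      node {
        id
        displayName
      }
    }
  }
}
```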
Queries extraction
All the queries we want to expose on our API must be defined in the query_type.rb file shown above. As complexity grows, this file quickly becomes overloaded. Fortunately, it is possible to extract the query logic into other files: resolvers.
The query_type.rb file will then look like this:
module Types
class QueryType < Types::BaseObject
# ...
field :all_providers_connection, resolver: Queries::AllProvidersConnection
# ...
end
The query logic will be in a separate file:
module Queries
  class AllProvidersConnection < Queries::BaseQuery
    description 'list all providers'
    type Types::Connections::ProvidersConnectionType, null: false

    argument :filter, Types::ProviderFilterInput, required: false
    argument :search, String, required: false
    argument :skip, Int, required: false
    argument :order, Types::Order, required: false

    def resolve(**args)
      res = connection_with_arguments(Provider.all, **args)
      res = apply_filter(res, args[:filter])
      res
    end
  end
end
The custom methods connection_with_arguments and apply_filter are defined in BaseQuery.
Sorting and filtering connections
connection_with_arguments allows me to integrate sorting arguments and numbered paging into my queries:
def connection_with_arguments(res, **args)
  order = args[:order] || { by: :id, direction: :desc }
  res = res.filter(args[:search]) if args[:search]
  res = res.offset(args[:skip]) if args[:skip]
  # specifying the table name is necessary to enable
  # sorting when the initial SQL query contains joins
  res = res.order(
    "#{res.model.table_name}.#{order[:by]} #{order[:direction]}"
  )
  res
end
apply_filter provides a global logic for all the filter arguments of the API.
Usually, in our REST APIs, we use scopes to let users filter the results of a query, but those filters remain fairly basic. In designing this first GraphQL API, working with my fellow React developer and API consumer, we wanted to go a little further: GraphQL makes it possible to choose precisely the data you wish to receive, so we might as well also make it possible to filter that data just as precisely.
So we looked for an existing format for filters, and decided to use the standard proposed in the GatsbyJS documentation as a starting point.
Complete list of possible operators
- eq: short for equal, must match the given data exactly
- ne: short for not equal, must be different from the given data
- regex: short for regular expression, must match the given pattern. Note that backslashes need to be escaped twice, so /\w+/ needs to be written as "/\\\\w+/".
- glob: short for global, allows the use of the wildcard *, which acts as a placeholder for any non-empty string
- in: short for in array, must be an element of the array
- nin: short for not in array, must NOT be an element of the array
- gt: short for greater than, must be greater than the given value
- gte: short for greater than or equal, must be greater than or equal to the given value
- lt: short for less than, must be less than the given value
- lte: short for less than or equal, must be less than or equal to the given value
- elemMatch: short for element match, indicates that the field you are filtering will return an array of elements, on which you can apply a filter using the previous operators
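To make the semantics concrete, here is a plain-Ruby sketch of a few of these operators applied to an in-memory collection (illustration only: the records and the apply_operator helper are hypothetical; the real integration translates the operators to SQL):

```ruby
# Plain-Ruby illustration of some filter operators (eq, ne, in, gt)
# applied to an in-memory collection of hashes.
RECORDS = [
  { name: 'Redbeard', approved: true,  score: 10 },
  { name: 'Frey',     approved: false, score: 42 },
].freeze

def apply_operator(records, field, op, value)
  records.select do |record|
    case op
    when :eq then record[field] == value   # exact match
    when :ne then record[field] != value   # different from value
    when :in then value.include?(record[field]) # element of the array
    when :gt then record[field] > value    # strictly greater
    else true
    end
  end
end

apply_operator(RECORDS, :approved, :eq, true).map { |r| r[:name] }
# => ["Redbeard"]
```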
The current integration looks like this:
def apply_filter(scope, filters)
  return scope unless filters

  filters.keys.each do |filter_key|
    filters[filter_key].keys.each do |arg_key|
      value = filters[filter_key][arg_key]
      # Here we translate the global IDs sent by the API consumer
      # into classic IDs recognized by PostgreSQL.
      if filter_key.match?(/([a-z])_id/)
        value = [filters[filter_key][arg_key]].flatten.map do |id|
          GraphQL::Schema::UniqueWithinType.decode(id)[1].to_i
        end
      end
      scope = case arg_key
              when :eq, :in
                scope.where(filter_key => value)
              when :ne, :nin
                scope.where.not(filter_key => value)
              when :start_with
                scope.where("#{filter_key} ILIKE ?", "#{value}%")
              else
                scope
              end
    end
  end
  scope
end
It's still a work in progress, the code is rudimentary, but it's enough to filter the first queries.
Pagination
On this first application, the screens were designed with numbered pagination in mind, so I had to add default fields to all connections:
class Types::BaseConnection < GraphQL::Types::Relay::BaseConnection
  field :total_count, Integer, null: false
  field :total_pages, Integer, null: false
  field :current_page, Integer, null: false

  def total_count
    return @total_count if @total_count.present?

    skipped = object.arguments[:skip] || 0
    @total_count = object.nodes.size + skipped
  end

  def total_pages
    page_size = object.first if object.respond_to?(:first)
    page_size ||= object.max_page_size
    # ceiling division, so an exact multiple of page_size
    # doesn't produce an extra empty page
    (total_count.to_f / page_size).ceil
  end

  def current_page
    page_size = object.first if object.respond_to?(:first)
    page_size ||= object.max_page_size
    skipped = object.arguments[:skip] || 0
    (skipped / page_size) + 1
  end
end
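The page arithmetic can be checked in isolation (a sketch with standalone helpers, using ceiling division so that a total that is an exact multiple of the page size does not yield an extra empty page):

```ruby
# Numbered-pagination arithmetic, extracted as plain functions.

# Ceiling division: 351 items at 25 per page => 15 pages, not 14.04.
def total_pages(total_count, page_size)
  (total_count.to_f / page_size).ceil
end

# Integer division: skipping 50 items at 25 per page puts us on page 3.
def current_page(skipped, page_size)
  (skipped / page_size) + 1
end

total_pages(351, 25)  # => 15
current_page(50, 25)  # => 3
```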
It works, but it is regrettable: GraphQL is really designed for cursor pagination, so we are forced to reinvent the wheel in several places when a turnkey, and certainly more efficient, mechanism is available. We would certainly gain a lot in performance by using cursor pagination everywhere we can do without the total number of pages, the page number, and the number of items in the collection.
So that's good?
Our first feature using GraphQL will go into production next week, so it's still a little early to make a full assessment of the use of this API language.
Nevertheless, the benefits for our front-end developers are immediate, since React and GraphQL are designed to work together.
On the other hand, since this first project is only consumed by a desktop application, we will soon have to look into GraphQL libraries for mobile languages, especially Flutter, test them, and hope they offer the same ease of use.
As for the back-end, I enjoyed working with fully typed classes. Although a bit verbose, it imposes a rigor which, once the reflex is acquired, becomes particularly comfortable (fewer surprise bugs, and when something breaks we find out why more quickly).
Note: This motivated me to take a renewed interest in Sorbet, perhaps the subject of a future article?
Of course, you have to relearn the whole way you build your API services, but I think it's for the best. Without even considering switching from REST to GraphQL, knowing and testing each other's paradigms can only improve the way we work.
The question of performance will remain, in particular the fight against N+1 requests... To be continued!
Top comments (4)
We've been using GraphQL Ruby Pro in production for a couple years now. A tool that we've used to help prevent N+1 queries is JIT Preloader github.com/clio/jit_preloader
You can use it in a field resolver: JIT Preloader will then figure out which associations to load down the query resolver chain. For more details, I would check out their docs. It has been pretty handy and a great way for us to address the N+1 query problem with ActiveRecord.
It's also handy because it can be applied outside the context of GraphQL as well, for example in the service layer.
Thanks Bryan for the advice! JIT Preloader sounds really nice. I've browsed through the documentation and it seems much less verbose than BatchLoader which I'm using at the moment anyway. I'm going to start trying it out in preparation for the next version of our GraphQL API. If it answers my concerns, it should be much more convenient to maintain.
Hi 🙂
We are starting to explore GraphQL too, and I found your post while looking for tips to implement numbered pagination. (We work in the same building 🙃 well, not these days, obviously.)
Thank you for this useful post, I will share it to my colleagues.
For numbered pagination, I ended up using this gist from the graphql gem author: gist.github.com/rmosolgo/da1dd95c2...
It implements numbered pagination with custom page types, like connection types but independently.
Hi! How well did this scale after almost one year? I recently joined a project where this gem is implemented, and there are lots of complaints about performance. We can fetch no more than 50 records at a time, which is a humble number for some listings, and even then my best average response times are around 500/600 ms with a query complexity of 65 (measured using GraphQL::Analysis::AST::QueryComplexity). If the page size is increased to 100 records, average response times scale up to around 1300 ms, which is ridiculous for that amount of records.
Of course, I've tried everything I could find online plus my own findings. After trying 3 different batch loaders, I even replaced Postgres queries with ElasticSearch to remove query-processing overhead; my results did get better, but still not great.
So could you please share your thoughts about performance? Thanks so much!
My stack: Rails 5.2, Ruby 2.7.2, hosted on EC2 c4.xlarge instances