How to GraphQL with Ruby, Rails, Active Record, and no N+1 — Martian Chronicles

You work on a mature web application that cleanly separates backend and frontend. The server-side code, written in Ruby, is mostly responsible for translating HTTP requests into SQL statements (with the help of an ORM) through a rich and well-documented API. You choose GraphQL over REST to streamline your endpoints, but your database is not happy with all the extra queries. After much searching, you find an exhaustive hands-on guide on fighting N+1 from a fellow GraphQL-ing Rubyist… Here it comes!

GraphQL can do wonders in a backend-only Rails application, giving your clients (whether a frontend framework or other API consumers) a single endpoint for fetching data in any shape and size they might need.

There’s only one catch, but it’s a big one. N+1 big.

As the list of associations to load is always determined at runtime, it is very hard to be smart about querying the database.

You can either accept the sad reality of one query for the parent record + one query for each association (hence “N+1”, even though the stricter term would be “1+N”)—or you can load all possible associations in advance with fancy SQL statements. But if you have a rich schema, and that’s the reason to switch to GraphQL in the first place, preloading can put an even bigger strain on the database than letting N+1 run amok. Luckily, there are tools in the Ruby-GraphQL world that allow us to be more selective and smarter about what we load, when, and how.

It’s always better to have an example

To ground the discussion, let’s draw up a practical example of a simple schema for a simple “Twitter clone” application. The goal here is not to be original but to be able to relate to types right away. They are Tweet, User, and Viewer. The Viewer is the user who views the feed of other users’ tweets. We created a separate type for a “current user” because it may expose properties otherwise inaccessible on “general” users.

class Types::Tweet < Types::BaseObject
  field :content, String, null: false
  field :author, Types::User, null: false
end

class Types::User < Types::BaseObject
  field :nickname, String, null: false
end

class Types::Viewer < Types::BaseObject
  field :feed, [Types::Tweet], null: false

  def feed
    # In this case, FeedBuilder is a query object
    # that returns a Tweet relation based on passed params
    FeedBuilder.for(current_user)
  end
end

class Types::Query < Types::BaseObject
  field :viewer, Types::Viewer, null: true, resolver_method: :current_user
end

class GraphqlSchema < GraphQL::Schema
  query Types::Query
end

I have also prepared a gist that contains our whole Rails “application” in a single file. You can’t boot it as a real application, but it’s functional enough to pass the included specs, which compare the different optimization methods we discuss in this article. To view the code and run the specs, you can run the following in your terminal in any temporary folder:

curl "" > demo.rb
createdb nplusonedb # To create a PostgreSQL test database, requires Postgres installation
rspec demo.rb       # To run tests that compare different N+1-fighting techniques

This code contains an N+1 problem right away. Querying the feed along with the nicknames of tweet authors will trigger a single query to the tweets table and N queries to the users table.

query {
  viewer {
    feed {
      content
      author {
        nickname
      }
    }
  }
}
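The cost of this pattern can be simulated without Rails at all. The sketch below is a toy illustration (the Store class and its query counter are invented for this example and are not part of any library): fetching each author individually costs one query per tweet, while collecting the IDs first costs a single query.

```ruby
# Toy in-memory "database" that counts the "SQL queries" it receives.
# (Store and all names here are invented for illustration.)
class Store
  attr_reader :query_count

  def initialize(users)
    @users = users # { id => nickname }
    @query_count = 0
  end

  def find_user(id)
    @query_count += 1 # one query per call: the N+1 pattern
    @users[id]
  end

  def find_users(ids)
    @query_count += 1 # one query for the whole batch
    ids.map { |id| @users[id] }
  end
end

authors = { 1 => "alice", 2 => "bob", 3 => "carol" }
tweets = [{ author_id: 1 }, { author_id: 2 }, { author_id: 3 }]

# Naive resolution: one lookup per tweet => N extra queries.
naive = Store.new(authors)
tweets.each { |t| naive.find_user(t[:author_id]) }
n_plus_one = naive.query_count # => 3

# Batched resolution: collect the IDs first, then a single lookup.
batched_store = Store.new(authors)
batched_store.find_users(tweets.map { |t| t[:author_id] })
batched = batched_store.query_count # => 1
```

With three tweets the difference is small, but the naive count grows linearly with the feed size, while the batched count stays constant.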

Solution #0: Load all the associations!

Let’s start by cleaning up our code and extracting feed loading to a resolver—a special class that encapsulates our database-querying logic.

class Resolvers::FeedResolver < Resolvers::BaseResolver
  type [Types::Tweet], null: false

  def resolve
    FeedBuilder.for(current_user)
  end
end

class Types::Viewer < Types::BaseObject
  field :feed, resolver: Resolvers::FeedResolver
end

If you’re interested, here’s the definition for our FeedBuilder module that abstracts out some Active Record calls:

module FeedBuilder
  module_function

  def for(user)
    Tweet.where(author: user.followed_users)
         .order(created_at: :desc)
         .limit(10)
  end
end

Extracting logic to a resolver allows us to create alternative resolvers and hot-swap them to compare results. Here’s a resolver that solves the N+1 problem by preloading all associations:

class Resolvers::FeedResolverPreload < Resolvers::BaseResolver
  type [Types::Tweet], null: false

  def resolve
    FeedBuilder.for(current_user).includes(:author) # Use AR eager loading magic
  end
end

This solution is the most obvious, but not ideal: we will make an extra SQL query to preload users no matter what, even if we request just the tweets and don’t care about their authors (I know, it’s hard to imagine, but let’s say it’s for an anonymized data-mining operation).

Also, we have to define a list of associations on the top level (in Query type or inside resolvers that belong to it). It’s easy to forget to add a new association to the list when a new nested field appears deep inside the graph.

However, this approach is helpful when you know that the client asks for the author data most of the time (for instance, when you control the frontend code).

Solution #1: Lookahead

The GraphQL execution engine is the part of the library responsible for processing a query and preparing a response. Read more in our 3-part “GraphQL on Rails” tutorial.

While resolving a query, GraphQL’s execution engine knows which data was requested, so it’s possible to find out what should be loaded at runtime. The graphql-ruby gem comes with a handy Lookahead feature that can tell us in advance if a specific field was requested. Let’s try it out in a separate resolver:

class Resolvers::FeedResolverLookahead < Resolvers::BaseResolver
  type [Types::Tweet], null: false

  extras [:lookahead]

  def resolve(lookahead:)
    FeedBuilder.for(current_user)
               .merge(relation_with_includes(lookahead))
  end

  private

  # .selects?(:author) returns true when the author field is requested
  def relation_with_includes(lookahead)
    return Tweet.all unless lookahead.selects?(:author)

    Tweet.includes(:author)
  end
end

In this case, we query the users table only when the client asks for the author field. This approach only works well when associations are few and not nested. If we take a more complex data model where users have avatars and tweets have likes, then our resolver can get out of hand real quick:

class Resolvers::FeedResolverLookahead < Resolvers::BaseResolver
  type [Types::Tweet], null: false

  extras [:lookahead]

  def resolve(lookahead:)
    scope = Tweet.where(user: User.followed_by(current_user))
                 .order(created_at: :desc)
                 .limit(10)
    scope = with_author(scope, lookahead) if lookahead.selects?(:author)
    scope = with_liked_by(scope, lookahead) if lookahead.selects?(:liked_by)
    scope
  end

  private

  def with_author(scope, lookahead)
    if lookahead.selection(:author).selects?(:avatar)
      scope.includes(user: :avatar_attachment)
    else
      scope.includes(:user)
    end
  end

  def with_liked_by(scope, lookahead)
    if lookahead.selection(:liked_by).selects?(:user)
      if lookahead.selection(:liked_by).selection(:user).selects?(:avatar)
        scope.includes(likes: { user: :avatar_attachment })
      else
        scope.includes(likes: :user)
      end
    else
      scope.includes(:likes)
    end
  end
end

You’re right, that’s not elegant at all! What if there was a way to load associations only when they are accessed? Lazy preloading can help us!

Solution #2: Lazy preloading (by Evil Martians)

With some help from my Evil Martian colleagues, I’ve written a little gem called ar_lazy_preload that lets us fall back to the preloading solution but makes it smarter without any additional effort. It makes a single request to fetch all associated objects only after the association is accessed for the first time. Of course, it works outside of GraphQL too and can be really handy in REST APIs or while building server-rendered views. All you need is to add gem "ar_lazy_preload" to your Gemfile, bundle install, and then you’ll be able to write your resolver like so:

class Resolvers::FeedResolverLazyPreload < Resolvers::BaseResolver
  type [Types::Tweet], null: false

  def resolve
    FeedBuilder.for(current_user).lazy_preload(:author)
  end
end

The gem is created with laziness in mind, so if you feel lazy even to type .lazy_preload all the time, you can enable it globally for all Active Record calls by adding a line of configuration:

ArLazyPreload.config.auto_preload = true

However, this approach has some downsides:

  • we have finally brought in our first external dependency;
  • we do not have much control over the queries that are made, and it will be hard to customize them;
  • if lazy preloading is not turned on globally, we still have to list all possible associations at the top level;
  • if one table is referenced from two places, we will make two database requests instead of one.

What else can we do?

Solution #3: graphql-ruby lazy resolvers

The graphql-ruby gem that makes GraphQL possible in our Ruby apps comes bundled with a way to use lazy execution:

  • instead of returning data, you can return a special lazy object (this object should remember the data it replaced);
  • when a lazy value is returned from a resolver, the execution engine stops further processing of the current subtree;
  • when all non–lazy values are resolved, the execution engine asks the lazy object to resolve;
  • the lazy object loads the data it needs and returns it for each lazy field.
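The mechanism is easier to grasp stripped of graphql-ruby specifics. Here is a minimal plain-Ruby sketch of the idea (UserStore, LazyUser, and all other names below are invented for illustration): lazy objects only record IDs, and the first value access triggers one batched load for everything collected so far.

```ruby
require "set"

# Stand-in for the database: one call loads many users at once.
# (UserStore and all names here are invented for illustration.)
module UserStore
  def self.fetch(ids)
    ids.to_h { |id| [id, "user-#{id}"] }
  end
end

# A lazy object: it remembers an ID and shares accumulating state
# with all other lazy objects created during the same "query".
class LazyUser
  def initialize(state, id)
    @state = state
    @id = id
    (state[:ids] ||= Set.new) << id
  end

  # Called only after the engine has collected every lazy object:
  # the first call performs one batched load for all recorded IDs.
  def value
    @state[:cache] ||= UserStore.fetch(@state[:ids].to_a)
    @state[:cache][@id]
  end
end

state = {}
lazies = [LazyUser.new(state, 1), LazyUser.new(state, 2)]
# Nothing has been loaded yet; the first #value call resolves the batch.
values = lazies.map(&:value) # => ["user-1", "user-2"]
```

The real implementation below follows the same shape, with the query context playing the role of the shared state hash.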

It takes some time to wrap your head around this, so let’s implement a lazy resolver step by step. First of all, we can reuse the initial FeedResolver that is not aware of associations:

class Resolvers::FeedResolver < Resolvers::BaseResolver
  type [Types::Tweet], null: false

  def resolve
    FeedBuilder.for(current_user)
  end
end

Then, we should return a lazy object from our Tweet type. We need to pass the ID of the user and the query context, because we will use the context to store a list of IDs to load:

class Types::Tweet < Types::BaseObject
  field :content, String, null: false
  field :author, Types::User, null: false

  def author
    Resolvers::LazyUserResolver.new(context, object.user_id)
  end
end

Each time a new object is initialized, we add a pending user ID to the query context, and, when #user is called for the first time, we make a single database request to get all the users we need. After that, we can fill user data for all lazy fields. Here is how we can implement it:

class Resolvers::LazyUserResolver
  def initialize(context, user_id)
    @user_id = user_id
    @lazy_state = context[:lazy_user_resolver] ||= {
      user_ids: Set.new,
      users_cache: nil
    }
    @lazy_state[:user_ids] << user_id
  end

  def user
    users_cache[@user_id]
  end

  private

  def users_cache
    @lazy_state[:users_cache] ||= begin
      user_ids = @lazy_state[:user_ids].to_a
      @lazy_state[:user_ids].clear
      User.where(id: user_ids).index_by(&:id)
    end
  end
end

Wondering how the execution engine can tell the difference between regular and lazy objects? We should define the lazy resolver in the schema:

class GraphqlSchema < GraphQL::Schema
  lazy_resolve(Resolvers::LazyUserResolver, :user)

  query Types::Query
end

It tells the execution engine to stop resolving users when the Resolvers::LazyUserResolver object is returned and only come back to it after all the other, non-lazy fields are resolved.

That works, but it’s quite a bit of boilerplate code that you might have to repeat often. Plus, the code can become quite convoluted when our lazy resolvers need to resolve other lazy objects. Fortunately, there exists a less verbose alternative.

Solution #4: Batch loading

The gem graphql-batch from Shopify uses the same lazy mechanism of graphql-ruby but hides the ugly boilerplate part. All we need to do is inherit from GraphQL::Batch::Loader and implement the perform method:

class RecordLoader < GraphQL::Batch::Loader
  def initialize(model)
    @model = model
  end

  def perform(ids)
    @model.where(id: ids).each { |record| fulfill(record.id, record) }
    ids.each { |id| fulfill(id, nil) unless fulfilled?(id) }
  end
end

This loader (taken from the examples directory in the official repo) expects a model class in the initializer (to decide where the data should be loaded from). The #perform method is responsible for fetching the data, while the #fulfill method associates a key with the loaded data.

Batch loader usage is similar to the lazy version. We pass User to the initializer and the ID of the user to load lazily (this ID will be used as a key to fetch the associated user):

class Types::Tweet < Types::BaseObject
  field :content, String, null: false
  field :author, Types::User, null: false

  def author
    RecordLoader.for(::User).load(object.author_id)
  end
end

As usual, we need to turn on lazy loading in our schema:

class GraphqlSchema < GraphQL::Schema
  query Types::Query

  use GraphQL::Batch
end

How does this work? When use GraphQL::Batch is added to the schema, Promise#sync method is registered to resolve lazily (it uses Promise.rb under the hood). When #load method is called on a class that inherits from GraphQL::Batch::Loader, it returns a Promise object—that is why the execution engine treats it as a lazy value.

One useful side effect of this approach is that you can chain loading in the following way:

def product_image(id:)
  RecordLoader.for(Product).load(id).then do |product|
    RecordLoader.for(Image).load(product.image_id)
  end
end
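To get a feel for what the gem does under the hood, here is a toy sketch of the batching idea in plain Ruby (ToyLoader is invented for illustration and is not the gem’s actual implementation): #load only records a key and hands back a deferred value, while #run performs a single batched fetch and fulfills every recorded key.

```ruby
# Toy batch loader (invented for illustration; not the gem's internals).
class ToyLoader
  def initialize(&fetcher)
    @fetcher = fetcher # receives all keys at once, returns { key => record }
    @keys = []
    @results = {}
  end

  def load(key)
    @keys << key
    -> { @results.fetch(key) } # a lambda standing in for a Promise
  end

  def run
    # One batched call for every key recorded so far.
    @fetcher.call(@keys.uniq).each { |k, v| @results[k] = v }
  end
end

calls = 0
loader = ToyLoader.new do |keys|
  calls += 1 # the fetcher runs once, no matter how many #load calls
  keys.to_h { |k| [k, "record-#{k}"] }
end

pending = [loader.load(1), loader.load(2), loader.load(1)]
loader.run
records = pending.map(&:call) # => ["record-1", "record-2", "record-1"]
```

In the real gem, the execution engine plays the role of #run: it resolves all non-lazy fields first, then asks each loader to perform its accumulated batch.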

But even with all the advanced techniques we described above, it is still possible to end up with N+1. Imagine that we are adding an admin panel where you can see a list of users. When a user is selected, a user profile pops up, and you can see a list of their followers. In the GraphQL world, where data should be accessed from the place it belongs to, we could do something like this:

class Types::User < Types::BaseObject
  field :nickname, String, null: false

  field :followers, [Types::User], null: false do
    argument :limit, Integer, required: true, default_value: 2
    argument :cursor, Integer, required: false
  end

  def followers(limit:, cursor: nil)
    scope = object.followers.order(id: :desc).limit(limit)
    scope = scope.where("id < ?", cursor) if cursor
    scope
  end
end

class Types::Query < Types::BaseObject
  field :users, [Types::User], null: false

  field :user, Types::User, null: true do
    argument :user_id, ID, required: true
  end

  def users
    ::User.all
  end

  def user(user_id:)
    ::User.find(user_id)
  end
end

The list of users can be fetched using the following query:

query GetUsers($limit: Int) {
  users(limit: $limit) {
    nickname
  }
}

A list of users who follow a specific user can be loaded like so:

query GetUser($userId: ID, $followersLimit: Int, $followersCursor: ID) {
  user(userId: $userId) {
    followers(limit: $followersLimit, cursor: $followersCursor) {
      nickname
    }
  }
}

The problem appears when someone tries to load a list of users with their followers in the same query:

query GetUsersWithFollowers(
  $limit: Int
  $followersLimit: Int
  $followersCursor: ID
) {
  users(limit: $limit) {
    nickname
    followers(limit: $followersLimit, cursor: $followersCursor) {
      nickname
    }
  }
}

In this case, we cannot get rid of N+1 at all: we have to make a database call for each user because of cursor pagination. To handle such a case, we could use a less elegant solution and move pagination to the top level:

class Types::Query < Types::BaseObject
  field :users, [Types::User], null: false

  field :user, Types::User, null: true do
    argument :user_id, ID, required: true
  end

  field :user_followers, [Types::User], null: false do
    argument :user_id, ID, required: true
    argument :limit, Integer, required: true, default_value: 2
    argument :cursor, Integer, required: false
  end

  def users
    ::User.all
  end

  def user(user_id:)
    ::User.find(user_id)
  end

  def user_followers(user_id:, limit:, cursor: nil)
    scope = UserConnection.where(user_id: user_id).order(user_id: :desc).limit(limit)
    scope = scope.where("user_id < ?", cursor) if cursor
    scope
  end
end

This design still makes it possible to load users and their followers, but it turns out that we move from N+1 on the server side to N+1 HTTP requests. The solution looks fine, but hey, we love GraphQL for its logical schema structure! We want to fetch followers from the User type!

No problem. We can restrict fetching the followers field when multiple users are requested. Let’s return an error when it happens:

class Types::Query < Types::BaseObject
  field :users, [Types::User], null: false, extras: [:lookahead]

  field :user, Types::User, null: true do
    argument :user_id, ID, required: true
  end

  def users(lookahead:)
    if lookahead.selects?(:followers)
      raise GraphQL::ExecutionError, "followers can be accessed in singular association only"
    end

    ::User.all
  end

  def user(user_id:)
    ::User.find(user_id)
  end
end

With this schema, it’s still possible to fetch followers of a singular user, and we have completely prevented the unwanted scenario. Don’t forget to mention it in the docs!

That’s it! You’ve made it to the end of our little guide, and now you have at least six different approaches to try out in your Ruby-GraphQL code to make your application N+1 free.

Don’t forget to check out other articles on GraphQL and the N+1 problem in our blog: from the beginner-friendly code-along tutorial on building a Rails GraphQL application with a React frontend in three parts (start here) to more specific use cases: using GraphQL with Active Storage Direct Upload, dealing with persisted queries coming from Apollo, and reporting non-nullable violations in graphql-ruby.

We also have a couple of gems to make dealing with N+1 easier in “classic” Rails apps and a couple of articles to go along with them: Squash N+1 queries early with n_plus_one_control test matchers for Ruby and Rails and Fighting the Hydra of N+1 queries.

Over the past few years, our team has invested a lot of effort, including building open source tools, into making GraphQL a first-class citizen in Rails applications. If you’re thinking of introducing a GraphQL API in your Ruby backend, feel free to give us a shout.