Optimizing GraphQL with Dataloader

by László Hegedüs

Throughout my professional career, I’ve had the chance to work with a few programming languages from different paradigms. Some of them were a joy to use; others were bloated with features that could make even the simplest code unreadable. A little over two years ago I decided to give Erlang a try. Soon after, I started working at Erlang Solutions, and after a couple of months of intensive coding in Erlang I was introduced to Elixir. I’ve been working as an Elixir developer since then.

Over the last fourteen months I have been working with a client on mobility applications. We developed several parts of the backend of a corporate car sharing platform and built an application from scratch to conduct and analyze test drives at dealerships around the world. Development of the former had already started before we got there, so the tech stack was pretty much decided. We developed a handful of services exposing GraphQL APIs. For a number of reasons Elixir was the right fit for this purpose. Using the Absinthe library to craft the APIs was another good choice by their team. However, multiple services had to communicate with each other, and making a responsive UI requires optimized queries whenever possible. We learned a lot from the experience, which we'll share with you in this blog post.

When we started working on the second application, we had more flexibility in choosing the tools. We decided to stick with a GraphQL API and used Absinthe, but paid attention to how we write the resolver functions. In the end, our API was blazing fast, because most of our resolvers ran only a handful of queries.

Designing a usable API is art, as is optimizing the backend that serves that API. In this blog post — and hopefully some upcoming ones — I’ll give you a few hints on what to avoid or what to strive for when working with Absinthe. We cannot always affect how other applications/services work, but we can do our best to make our service as fast as possible.

There are plenty of resources out there to help you get started with Absinthe, and the number of tutorials on how to use Dataloader is growing, but it remains a recurring topic on Slack. I hope to give you an insight into how it works and how you can make use of it in your project.

What is Dataloader in a nutshell?

In short, Dataloader is a tool that can help your application fetch data in an optimal way by implementing batching and caching.

Dataloader in Elixir

In the Elixir world, it is provided by a library and integrates well with Ecto and Absinthe. Note that it’s not a silver bullet: queries using Dataloader still rely on an optimal implementation in the underlying library for the specific data source.

We are not going into details on how to use Dataloader on its own. Please consult the official documentation for that.

Instead, let’s dive into using Dataloader in Absinthe.

Using Dataloader in Absinthe Resolvers

For these examples, we are going to use rather simple schemas to avoid getting lost in the details. Assume we are building a database where we keep track of companies and their employees. One employee belongs to exactly one company, but each company may have multiple employees. The Ecto schemas may be defined as:
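A minimal version of those schemas might look like this (the `MyApp` module names are placeholders):

```elixir
defmodule MyApp.Company do
  use Ecto.Schema

  schema "companies" do
    field :name, :string
    has_many :employees, MyApp.Employee
  end
end

defmodule MyApp.Employee do
  use Ecto.Schema

  schema "employees" do
    field :name, :string
    belongs_to :company, MyApp.Company
  end
end
```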

And the corresponding object definitions in Absinthe GraphQL may look like:
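A sketch of the matching Absinthe objects, before we wire up any associations:

```elixir
object :company do
  field :id, :id
  field :name, :string
end

object :employee do
  field :id, :id
  field :name, :string
end
```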

This is all good until we want to resolve the employees in the company. A naïve field definition of it may be:
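Such a naïve resolver might query the employees directly, once per company (a sketch, assuming the `MyApp.Employee` schema above):

```elixir
import Ecto.Query

field :employees, list_of(:employee) do
  resolve fn company, _args, _resolution ->
    # One SQL query every time this field is resolved
    query = from e in MyApp.Employee, where: e.company_id == ^company.id
    {:ok, MyApp.Repo.all(query)}
  end
end
```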

Similarly, we can add the field :company to the employee object as:
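A similarly naïve sketch, issuing one query per employee:

```elixir
field :company, :company do
  resolve fn employee, _args, _resolution ->
    {:ok, MyApp.Repo.get(MyApp.Company, employee.company_id)}
  end
end
```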

If we now query a company through GraphQL and also ask for the employees on that field, then our backend will perform two SQL queries. This is not bad at all, but imagine a case where we have ten employee results. Moreover, each result asks for its own company field, which causes several duplicate queries to Ecto.

```graphql
query($id: ID!) {
  company(id: $id) {
    id
    name
    employees {
      id
      name
      company {
        id
        name
      }
    }
  }
}
```

This may not happen in this exact form, but it helps us imagine what happens when the same associated object has to be resolved for a list of results. We have to make several queries to Ecto for this to be answered. One query to resolve the company in the root, one query for each employee and one additional query for resolving the company of each employee. Twenty-one queries overall; we’re facing the infamous n+1 problem. This is where Dataloader comes to the rescue.

Adding Dataloader

The documentation of Absinthe is a good starting point for using Dataloader with an Ecto data source. In short, if we want to use Dataloader in our resolvers we have to do two things:

  • Add a dataloader struct to the resolution context
  • Add Absinthe.Middleware.Dataloader to the list of plugins in our schema
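The first step can be done in the schema's context/1 callback (a sketch, assuming a :db source name and the MyApp.Repo from before):

```elixir
def context(ctx) do
  loader =
    Dataloader.new()
    |> Dataloader.add_source(:db, Dataloader.Ecto.new(MyApp.Repo))

  # Absinthe's Dataloader middleware looks for the loader in the context
  Map.put(ctx, :loader, loader)
end
```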

Let’s not forget to add the Dataloader plugin:
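It goes into the schema's plugins/0 callback, in front of the defaults:

```elixir
def plugins do
  [Absinthe.Middleware.Dataloader] ++ Absinthe.Plugin.defaults()
end
```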

With these modifications, resolving the GraphQL query above requires only three Ecto queries. That is a significant improvement.
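The association fields can then delegate to the dataloader helper from Absinthe.Resolution.Helpers instead of hand-written resolvers:

```elixir
import Absinthe.Resolution.Helpers, only: [dataloader: 1]

object :company do
  field :id, :id
  field :name, :string
  # Batched through the :db source, using the association name
  field :employees, list_of(:employee), resolve: dataloader(:db)
end

object :employee do
  field :id, :id
  field :name, :string
  field :company, :company, resolve: dataloader(:db)
end
```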

The query function

One of the useful features of the Dataloader.Ecto source is that we can pass a query function to it which can be used for filtering or processing parameters that are common to many fields, for example, pagination arguments.
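The query function is passed when creating the Ecto source (a sketch; `Repo.dataloader_query/2` is a helper we define ourselves, the name is an assumption):

```elixir
def context(ctx) do
  source = Dataloader.Ecto.new(MyApp.Repo, query: &MyApp.Repo.dataloader_query/2)

  loader =
    Dataloader.new()
    |> Dataloader.add_source(:db, source)

  Map.put(ctx, :loader, loader)
end
```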

Where we can define Repo.dataloader_query/2 to process parameters related to pagination and also leave room for extending it easily.
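A possible first version, handling :limit and :offset and silently ignoring everything else, so it stays easy to extend:

```elixir
import Ecto.Query

def dataloader_query(queryable, params) do
  Enum.reduce(params, queryable, fn
    {:limit, limit}, query -> limit(query, ^limit)
    {:offset, offset}, query -> offset(query, ^offset)
    _other, query -> query
  end)
end
```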

Note that so far we haven’t needed to write any Ecto queries, because we used the dataloader/1 helper from Absinthe.Resolution.Helpers.
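Pagination is where this convenience can bite back. Consider a query that asks for a limited number of employees for two companies at once (assuming the employees field declares a limit argument that is forwarded to Dataloader):

```graphql
query {
  first: company(id: 1) {
    employees(limit: 3) { id name }
  }
  second: company(id: 2) {
    employees(limit: 3) { id name }
  }
}
```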

We only see employees for one of the companies, while the list of employees for the other one is empty. This happens because the Ecto Dataloader tries to fetch the employees for both companies with a single query, ordered by the company_id field, so the limit applies to the combined result set: the first company's rows fill the limit and the second company gets nothing. This works well when we don't want to add any limit or offset parameters. One workaround is to modify the argument list, which forces the loader to make separate queries for each company.
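One way to sketch that workaround: wrap the helper and mix the parent's id into the arguments, so each company ends up in its own batch and gets its own limit. The extra :company_id key is an assumption; the query function simply ignores it.

```elixir
import Absinthe.Resolution.Helpers, only: [dataloader: 1]

field :employees, list_of(:employee) do
  arg :limit, :integer

  resolve fn company, args, resolution ->
    # Different args mean different batch keys, forcing one query per company
    args = Map.put(args, :company_id, company.id)
    dataloader(:db).(company, args, resolution)
  end
end
```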

The query function is also useful if we want to have more flexibility, for example, filtering or ordering results. For this, we’ll have to extend our query function to process extra parameters. In general, I like to write query helpers that take a queryable object as their first parameter and return a queryable. For example (assuming we have a status field on the employee):
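Helpers like these could live on the schema module (the :status field and its values are assumptions):

```elixir
import Ecto.Query

# Each helper takes a queryable and returns a queryable,
# so they compose with a pipe.
def scope_active(queryable) do
  from e in queryable, where: e.status == "active"
end

def order_by_name(queryable) do
  from e in queryable, order_by: e.name
end
```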

If we extend the query function ( Repo.dataloader_query/2) as below, we will be able to use these helpers easily:
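The extended version might accept a :query parameter carrying such a helper:

```elixir
def dataloader_query(queryable, params) do
  Enum.reduce(params, queryable, fn
    {:limit, limit}, query -> limit(query, ^limit)
    {:offset, offset}, query -> offset(query, ^offset)
    # A (queryable -> queryable) helper, applied as-is
    {:query, fun}, query when is_function(fun, 1) -> fun.(query)
    _other, query -> query
  end)
end
```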

And we can now resolve only active employees on companies if we specify query in the resolver:
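A sketch of such a field (the :active_employees name is made up; it still loads through the :employees association):

```elixir
field :active_employees, list_of(:employee) do
  resolve dataloader(:db, :employees, args: %{query: &MyApp.Employee.scope_active/1})
end
```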

Of course, other filtering is also possible, and we can also construct our queries based on the GraphQL parameters that the resolver receives in args.

Note that now :query becomes part of the key that is used in the dataloader cache, so querying the active and non-active employees of the same company in one GraphQL query might require two database queries.


Using Dataloader.KV

So far, we have only seen examples of how to use Dataloader.Ecto. But what if we need to collect data from another service to respond to the GraphQL query? We can use Dataloader.KV to retrieve and cache data based on keys.

For this, we will need a function that receives two parameters. The first parameter is a batch key that groups together different objects. We will return to this shortly. The second parameter is a list, usually a list of objects or IDs from which the required data is to be retrieved. The function should return a map where the keys are the elements of this list, and the associated values are the corresponding retrieved data. For example, if we store the addresses of employees in a different service, we may write a loader function as follows:
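A sketch of such a loader; `AddressService.fetch_addresses/1` stands in for a real client that supports batched lookups and returns a `%{employee_id => address}` map:

```elixir
def load(_batch_key, employees) do
  ids = Enum.map(employees, & &1.id)

  # One batched call to the other service instead of one call per employee
  addresses_by_id = AddressService.fetch_addresses(ids)

  # Dataloader.KV expects a map keyed by the items it was given
  Map.new(employees, fn employee ->
    {employee, Map.get(addresses_by_id, employee.id)}
  end)
end
```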

This function receives a list of employees and returns a map with each employee mapped to an address. How the call to the other service and the lookup are implemented is a detail, but for the solution to be optimal, the other service should support querying data in batches (all IDs at once instead of one by one).

To use this function as a load function for a Dataloader.KV source we may change the context function in the schema as follows.
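Keeping the Ecto source from before and adding the KV source under the :address name (the loader module name is an assumption):

```elixir
def context(ctx) do
  loader =
    Dataloader.new()
    |> Dataloader.add_source(:db, Dataloader.Ecto.new(MyApp.Repo))
    |> Dataloader.add_source(:address, Dataloader.KV.new(&MyApp.AddressLoader.load/2))

  Map.put(ctx, :loader, loader)
end
```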

Then we can resolve the address field for each employee using the :address dataloader source:
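A sketch of the field, spelling out the load/on_load steps explicitly (on_load comes from Absinthe.Resolution.Helpers):

```elixir
field :address, :address do
  resolve fn employee, _args, %{context: %{loader: loader}} ->
    loader
    |> Dataloader.load(:address, :address, employee)
    |> on_load(fn loader ->
      {:ok, Dataloader.get(loader, :address, :address, employee)}
    end)
  end
end
```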

As you can see, we did not use the batch key in our loader, which means we will handle all employees in one batch. This is usually fine. The batch key can be useful if we intend to pass on certain arguments to the other service or refine the results. Perhaps we have a user token that we intend to supply for the service to check whether the user has the necessary access rights:
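The batch key can then become a tuple carrying the token (a sketch; assumes the token was put into the Absinthe context at authentication time):

```elixir
field :address, :address do
  resolve fn employee, _args, %{context: %{loader: loader, token: token}} ->
    batch_key = {:address, %{token: token}}

    loader
    |> Dataloader.load(:address, batch_key, employee)
    |> on_load(fn loader ->
      {:ok, Dataloader.get(loader, :address, batch_key, employee)}
    end)
  end
end
```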

Then make the load function handle the additional arguments:
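Matching on the batch key gives the loader access to those arguments (the token option on the client call is, again, an assumption):

```elixir
def load({:address, %{token: token}}, employees) do
  ids = Enum.map(employees, & &1.id)

  # The token is forwarded so the other service can check access rights
  addresses_by_id = AddressService.fetch_addresses(ids, token: token)

  Map.new(employees, fn employee ->
    {employee, Map.get(addresses_by_id, employee.id)}
  end)
end
```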

Note that for each different batch key we have to make a call to the other service, so we have to be careful when specifying the arguments. For example, if we pass in a unique ID (e.g., employee.id), then we lose the advantage of batching: the function is called once per employee.

In general, constructing the batch key provides flexibility, but it can also hide important details in the code. Use with caution.

More control over Dataloader

We can also embed on_load calls to optimize fetching nested results. First, we tell Dataloader to load the employees, then we use those employees to load their addresses. Finally, we fetch the addresses and count how many unique ones there are.
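A sketch of that pattern: a hypothetical :unique_address_count field on :company that chains the :db and :address sources (load_many/get_many are part of the Dataloader API; on_load comes from Absinthe.Resolution.Helpers):

```elixir
field :unique_address_count, :integer do
  resolve fn company, _args, %{context: %{loader: loader}} ->
    loader
    |> Dataloader.load(:db, :employees, company)
    |> on_load(fn loader ->
      employees = Dataloader.get(loader, :db, :employees, company)

      # Second round: batch-load an address for every employee
      loader
      |> Dataloader.load_many(:address, :address, employees)
      |> on_load(fn loader ->
        unique_count =
          loader
          |> Dataloader.get_many(:address, :address, employees)
          |> Enum.uniq()
          |> Enum.count()

        {:ok, unique_count}
      end)
    end)
  end
end
```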

In general, this kind of resolver is useful when we want to move data one (or more) level up the tree with or without aggregation. In one project, I used the same solution to retrieve telematics data of vehicles to be aggregated and displayed on certain trips taken with those vehicles. Both the vehicles and telematics data needed to be queried from other services.


Dataloader is a powerful tool when it comes to optimizing queries, but we have to be aware of its limitations. We saw a few simple examples to get up and running with Dataloader and Absinthe. This is only the tip of the iceberg, and I am hoping to follow up with some more advanced tricks and tips.


Originally published at https://www.erlang-solutions.com on April 1, 2020.

World-class solutions for issues of scale, reliability & performance. Passionate about Erlang & Elixir. MongooseIM & WombatOAM creators. Also RabbitMQ experts!