Elvio Vicosa for AppSignal

Posted on • Originally published at blog.appsignal.com

Caching with Elixir and ETS

In this post, you'll learn how to use ETS as a caching mechanism in your Elixir applications, get familiar with different available options, and be made aware of some things to keep in mind.

The Concept of Caching

My two-year-old son loves eating cookies. He loves them so much that even while playing soccer, he keeps going back to the jar to grab one more.

He doesn't enjoy interrupting the game to get another cookie, so he started grabbing several at once and keeping them in his hands.

That concept of keeping things that are important, but are costly to get, close to the subject that needs to use them, is commonly referred to as caching.

In computing, caching is a ubiquitous concept that is used in both hardware and software. It has been around for a long time and is a smart concept, ported from real-life situations.

How Popular Languages Deal with Caching

Applications written in popular programming languages usually rely on external dependencies such as Memcached or Redis for caching. Using these tools is almost a standard practice now.

Although that's a valid approach, it introduces yet another dependency to a project. As simple as it initially seems, the operational cost of keeping that dependency alive (running, monitoring, patching with the latest security updates, etc.) might be overwhelming when the requirements are rather simple.

Cache in Elixir (and Erlang)

In Elixir, as in any other language, caching is a common way to avoid repeatedly fetching information that is expensive to obtain.

Elixir comes with a huge advantage that might help simplify the lives of developers in need of caching—ETS.

ETS stands for Erlang Term Storage. It enables developers to store and access data using a key.

Here's a small example of the usage of an ETS table:

iex(4)> :ets.new(:security_level, [:named_table])
:security_level

iex(5)> :ets.insert(:security_level, {1, :high})
true

iex(6)> :ets.insert(:security_level, {2, :low})
true

iex(7)> :ets.insert(:security_level, {3, :none})
true

iex(8)> :ets.lookup(:security_level, 1)
[{1, :high}]

In that example, an ETS table named :security_level is created, and a couple of values are inserted into it.

ETS tables have four different types:

  • Set: That's the default type, the one used in the above example. Each key can occur only once.

  • Ordered set: An ordered set has the same property as the set, but its items are kept sorted by key, following Erlang/Elixir term order.

  • Bag: An ETS table using the "bag" type supports multiple items per key, as long as the items themselves differ.

  • Duplicate bag: A "duplicate bag" allows multiple items per key and also allows the same item to be stored more than once.

The best ETS type to use depends on the specific needs of an application.
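To make the difference between the table types listed above concrete, here is a minimal sketch that performs the same three inserts on a set, a bag, and a duplicate bag. The table names, keys, and values are made up for illustration.

```elixir
# Illustrative only: compare how each table type handles repeated inserts.
set = :ets.new(:demo_set, [:set])
bag = :ets.new(:demo_bag, [:bag])
dup = :ets.new(:demo_dup, [:duplicate_bag])

for table <- [set, bag, dup] do
  :ets.insert(table, {:alice, :admin})
  # Identical object, inserted a second time.
  :ets.insert(table, {:alice, :admin})
  # Same key, different value.
  :ets.insert(table, {:alice, :editor})
end

# set: only the last object written for the key survives.
set_result = :ets.lookup(set, :alice)
# bag: both distinct objects are kept, the repeated one only once.
bag_result = :ets.lookup(bag, :alice)
# duplicate_bag: all three inserted objects are kept.
dup_result = :ets.lookup(dup, :alice)
```

For a cache that maps one key to one value, the default set is usually the natural fit.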

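The data structures used to implement the ETS tables in the Erlang VM are optimized to provide the best possible access time. Depending on the type, the Erlang VM uses either hash tables (set, bag, and duplicate bag) or balanced binary trees (ordered set) to represent the ETS table. Lookups by key take constant time in the hash-based types and time proportional to the logarithm of the number of items in an ordered set. Both are better than the linear access time of a list.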
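As a small sketch of what the tree-based representation gives you, the snippet below inserts keys out of order into an ordered set and reads them back sorted. The table name and the data are hypothetical.

```elixir
# Illustrative only: an ordered_set keeps its objects sorted by key.
scores = :ets.new(:demo_scores, [:ordered_set])

:ets.insert(scores, {30, "carol"})
:ets.insert(scores, {10, "alice"})
:ets.insert(scores, {20, "bob"})

# Smallest and largest keys in the table.
lowest = :ets.first(scores)
highest = :ets.last(scores)

# Matching every object on an ordered_set returns them in key order.
in_order = :ets.match_object(scores, :_)
```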

Here are two examples where I recently used ETS for caching purposes:

Feature Flags

We use continuous delivery at the company I work for, and feature flags are crucial if you're doing trunk-based development and want to integrate new, not yet finished code without enabling it for customers.

We store our feature flags in AWS Parameter Store, fetching them once every 5 minutes and caching them locally in an ETS table.

Soft Real-Time Stats

There are several cases where we need to display things like "X people have purchased that plan" or "That product was already used by Y customers".

A naive approach would be to perform database queries every time such values are required, but that would be extremely heavy (and unnecessary) on our database.

We cache those values and update them every couple of minutes. Although they might not reflect the real number, they indicate a "good enough" number, which is meaningful to customers.
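A periodic refresh like the one described here could be sketched as follows. This is not the code from our project: the `StatsCache` module, the `:fetch_fun` and `:interval_ms` options, and the sample value are all assumptions made for illustration, and error handling is left out.

```elixir
defmodule StatsCache do
  # Hypothetical module: caches values in ETS and refreshes them on a timer.
  use GenServer

  @table :stats_cache

  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  # Reads go straight to ETS; returns `default` when the key is not cached.
  def get(key, default \\ nil) do
    case :ets.lookup(@table, key) do
      [{^key, value}] -> value
      [] -> default
    end
  end

  @impl true
  def init(opts) do
    :ets.new(@table, [:named_table, :set, :protected, read_concurrency: true])

    state = %{
      # Assumed: a zero-arity function returning a list of {key, value}
      # tuples, e.g. a wrapper around the expensive database queries.
      fetch_fun: Keyword.fetch!(opts, :fetch_fun),
      interval_ms: Keyword.get(opts, :interval_ms, :timer.minutes(2))
    }

    refresh(state)
    {:ok, state}
  end

  @impl true
  def handle_info(:refresh, state) do
    refresh(state)
    {:noreply, state}
  end

  # Overwrites the cached values and schedules the next refresh.
  defp refresh(state) do
    :ets.insert(@table, state.fetch_fun.())
    Process.send_after(self(), :refresh, state.interval_ms)
  end
end

# Usage with a placeholder fetch function and a made-up number.
{:ok, _pid} = StatsCache.start_link(fetch_fun: fn -> [plan_purchases: 1_204] end)
purchases = StatsCache.get(:plan_purchases)
```

Between two refreshes, readers simply see the values from the previous one, which is the "good enough" behavior described above.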

Things to Keep in Mind When Using ETS as a Cache Solution

No Additional Dependencies

ETS is part of the Erlang VM, so there are no additional dependencies necessary when using it.

Optimized Lookup

The lookup of an item (or group of items) is optimized depending on the type of the ETS table.

No Serialization

ETS tables can store Elixir data structures. This means you can store a group of %Accounts.User{...} or %PlayerScore{} and you'll be able to fetch those structures back without having to serialize them.
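Here is a minimal sketch of that round trip. The fields of `PlayerScore` and the `ScoreCache` helper are hypothetical; only the struct name comes from the text above.

```elixir
defmodule PlayerScore do
  # Hypothetical fields; the text above only names the struct.
  defstruct [:player, :points]
end

defmodule ScoreCache do
  # Hypothetical helper showing a struct round trip through ETS.
  def round_trip do
    table = :ets.new(:demo_player_scores, [:set])
    original = %PlayerScore{player: "ana", points: 42}

    # The struct is stored as-is: no encoding step on the way in...
    :ets.insert(table, {original.player, original})

    # ...and no decoding step on the way out.
    [{"ana", %PlayerScore{} = fetched}] = :ets.lookup(table, "ana")
    {original, fetched}
  end
end

{original, fetched} = ScoreCache.round_trip()
```

The term is still copied between the table and the reading process, so very large values are not free to look up, but there is no JSON or binary encoding involved.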

Not Garbage Collected

When an ETS table is no longer necessary, you must delete it explicitly using :ets.delete/1. Otherwise, it stays in memory for as long as the process that owns it is alive, even if nothing uses it anymore.

A common solution is to create the table inside a dedicated process (e.g. a GenServer). Since the ETS table is owned by the process that created it, the table is deleted automatically when that process dies (or is terminated), so it effectively gets "garbage collected" together with its owner.
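The ownership rule can be observed with a plain process. The sketch below is only a demonstration under made-up names (`:short_lived`, the `:table_ready` and `:stop` messages); in an application, the owner would normally be a supervised GenServer.

```elixir
# Illustrative only: a table disappears when its owner process terminates.
parent = self()

owner =
  spawn(fn ->
    :ets.new(:short_lived, [:named_table])
    send(parent, :table_ready)

    # Keep the owner (and therefore the table) alive until told to stop.
    receive do
      :stop -> :ok
    end
  end)

receive do
  :table_ready -> :ok
end

ref = Process.monitor(owner)

# While the owner is alive, the table exists and reports its owner.
owner_while_alive = :ets.info(:short_lived, :owner)

send(owner, :stop)

# Wait until the owner has actually terminated.
receive do
  {:DOWN, ^ref, :process, ^owner, _reason} -> :ok
end

# :ets.info/1 returns :undefined for a table that no longer exists.
info_after_exit = :ets.info(:short_lived)
```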

Use of ETS in Distributed Systems

An ETS table lives on the node that created it and is not replicated to other nodes. In both of the examples I shared, the projects didn't connect multiple nodes in a distributed Elixir fashion. Nevertheless, those features are part of a project deployed to several web servers, each one with its own local copy of the ETS table.

In both examples, "Feature Flags" and "Soft real-time stats", ETS works well because eventual consistency is an acceptable trade-off: for a short while, different servers may hold slightly different values.

How to Create and Use ETS Within a GenServer

Previously, I mentioned a case where I used ETS for "Feature Flags". In this section, I want to give you a better understanding of its internals.

The non-cached code is quite simple:

defmodule FeatureFlags do
  def enabled?(name) do
    name in get_list_of_enabled_flags()
  end

  defp get_list_of_enabled_flags() do
    # ... Access AWS, fetch values for specific project/env
  end
end

FeatureFlags.enabled?/1 returns true if the feature flag is enabled and false otherwise.

The problem with this code lies in the fact that for every call to FeatureFlags.enabled?/1, the code performs requests to AWS and parses the response.

Since the changes to the features happen once or twice a day, we can safely cache successful returns from AWS and use the cached version.

What should we use for caching? If you guessed ETS, you're right!

Here's the cached version:

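defmodule FeatureFlags do
  use GenServer

  @table :features

  def start_link(_args) do
    GenServer.start_link(__MODULE__, nil, name: __MODULE__)
  end

  # Reads go straight to ETS, without sending a message to the GenServer.
  def enabled?(name) do
    :ets.lookup_element(@table, name, 2)
  rescue
    # Raised when the flag is not in the table (or the table does not exist).
    ArgumentError -> false
  end

  @impl true
  def init(nil) do
    # Create the table here so that the GenServer process owns it.
    :ets.new(@table, [:named_table, read_concurrency: true])

    for feature_name <- get_list_of_enabled_flags() do
      :ets.insert(@table, {feature_name, true})
    end

    {:ok, nil}
  end

  defp get_list_of_enabled_flags() do
    # ... Access AWS, fetch values for specific project/env
  end
end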

The first difference is the use of a GenServer. As previously mentioned, ETS tables are not automatically garbage collected. When the table is created by a GenServer, it is owned by that process and is deleted together with it if the process exits.

In init/1, we fetch all the features from AWS and insert them in the ETS table. The init/1 function is called only once, when the GenServer starts. After that, all the requests to the FeatureFlags module are going to use the cached version.

(In that example, for simplicity reasons, we are not dealing with error handling or the update of the features).

With our modified version, new calls to FeatureFlags.enabled?/1 use ETS to look up the values.

You might be wondering: "But since you're using a GenServer, why not use the GenServer state itself to hold the feature list instead?".

And that's a totally valid point. The main reason behind that decision is that every request to access a GenServer's state (e.g. using handle_call or handle_cast) goes through the process mailbox and is processed serially, one message at a time, which can turn the GenServer into a bottleneck under load. That's the reason why FeatureFlags.enabled?/1 reads from the ETS table directly instead of asking the GenServer for its state.

That approach achieves two essential points:

  • It ties the lifetime of the ETS table to the GenServer, so the table is cleaned up by design, and
  • It achieves concurrency, by not serializing requests inside the GenServer.
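The concurrency point can be sketched with a handful of tasks reading the same table. The table and flag names below are hypothetical, and the numbers are arbitrary.

```elixir
# Illustrative only: many processes read the same table concurrently.
table = :ets.new(:demo_flags, [:set, :protected, read_concurrency: true])
:ets.insert(table, [{:new_checkout, true}, {:dark_mode, true}])

# Each task runs in its own process and reads ETS directly; no single
# process sits between the readers and the data.
results =
  1..1_000
  |> Task.async_stream(
    fn _ -> :ets.lookup_element(table, :new_checkout, 2) end,
    max_concurrency: 50
  )
  |> Enum.map(fn {:ok, value} -> value end)

all_enabled? = Enum.all?(results)
```

With the state held inside a GenServer, the same thousand reads would queue up in one mailbox and be answered one by one.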

Summary

Developers building applications using Elixir are fortunate to have ETS available as part of the Erlang VM toolbelt.

ETS provides the right mix of simplicity (removing the need for integrating yet another tool into a project) and performance (it's a fast and battle-tested piece of software).

Keep in mind that, like anything, ETS is not suitable for every caching problem. There are cases where ETS is not a good idea, and you need to rely on other tools to solve the problem, so take this content with a grain of salt and make sure ETS meets your needs before committing to it.
