In this post, you'll learn how to use ETS as a caching mechanism in your Elixir applications, get familiar with different available options, and be made aware of some things to keep in mind.
The Concept of Caching
My two-year-old son loves eating cookies. He loves them so much that even while playing soccer, he keeps going back to the jar to grab one more.
Having to stop playing to get another cookie is not something he enjoys, so he started grabbing several cookies at once and keeping them in his hands.
Keeping things that are important but costly to get close to whoever needs to use them is commonly referred to as caching.
In computing, caching is a ubiquitous concept that is used in both hardware and software. It has been around for a long time and is a smart concept, ported from real-life situations.
How Popular Languages Deal with Caching
Popular programming languages usually rely on external dependencies such as Memcached or Redis for caching. Using these tools is almost a standard practice now.
Although that's a valid approach, it introduces yet another dependency to a project. As simple as it initially seems, the operational cost of keeping that dependency alive (running, monitoring, patching with the latest security updates, etc.) might be overwhelming when the requirements are rather simple.
Cache in Elixir (and Erlang)
In Elixir, caching is a common way to avoid repeatedly fetching information that is expensive to obtain.
Elixir comes with a huge advantage that might help simplify the lives of developers in need of caching—ETS.
ETS stands for Erlang Term Storage. It enables developers to store and access data using a key.
Here's a small example of the usage of an ETS table:
iex(4)> :ets.new(:security_level, [:named_table])
:security_level
iex(5)> :ets.insert(:security_level, {1, :high})
true
iex(6)> :ets.insert(:security_level, {2, :low})
true
iex(7)> :ets.insert(:security_level, {3, :none})
true
iex(8)> :ets.lookup(:security_level, 1)
[{1, :high}]
In that example, an ETS table named :security_level is created, and three values are inserted into it.
ETS tables have four different types:
Set: That's the default type—the one used in the above example. Each key can occur only once.
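Ordered set: An ordered set has the same property as the set, but its items are kept sorted by key, following Erlang term order.
Bag: A table of type "bag" supports multiple items per key, as long as the items themselves differ.
Duplicate bag: A "duplicate bag" supports multiple items per key and also allows identical items to be stored more than once.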
The best ETS type to use depends on the specific needs of an application.
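To make the difference between the types more concrete, here is a minimal sketch that inserts the same three tuples into a set, a bag, and a duplicate bag. The table names and the :api key are made up for illustration.

```elixir
# Unnamed tables: :ets.new/2 returns a reference that we use as the handle.
set = :ets.new(:levels_set, [:set])
bag = :ets.new(:levels_bag, [:bag])
dup = :ets.new(:levels_dup, [:duplicate_bag])

for table <- [set, bag, dup] do
  :ets.insert(table, {:api, :high})
  :ets.insert(table, {:api, :low})
  :ets.insert(table, {:api, :low})
end

# :set keeps one object per key, so later inserts overwrite earlier ones.
set_result = :ets.lookup(set, :api)

# :bag keeps several objects per key but drops exact duplicates.
bag_result = :ets.lookup(bag, :api)

# :duplicate_bag keeps every inserted object, duplicates included.
dup_result = :ets.lookup(dup, :api)

IO.inspect({set_result, bag_result, dup_result})
```

For a key/value cache where each key maps to a single value, the default set is usually what you want.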
The data structures used to implement the ETS tables in the Erlang VM are optimized to provide the best possible access time. Depending on the type, the Erlang VM uses either hash tables (set, bag, and duplicate bag) or a balanced binary tree (ordered set) to represent the ETS table. Lookups by key therefore take constant time in the first case and logarithmic time in the second, both of which beat the linear time needed to search a list.
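As a small illustration of the tree-backed type, the sketch below walks an ordered set with :ets.first/1 and :ets.next/2, which visit keys in term order regardless of the order of insertion. The OrderedKeys helper module and the :scores table are hypothetical names used only for this example.

```elixir
defmodule OrderedKeys do
  # Collects every key of an ETS table by following first/next.
  def list(table), do: walk(table, :ets.first(table), [])

  # :"$end_of_table" marks the end of the traversal.
  defp walk(_table, :"$end_of_table", acc), do: Enum.reverse(acc)
  defp walk(table, key, acc), do: walk(table, :ets.next(table, key), [key | acc])
end

scores = :ets.new(:scores, [:ordered_set])

# Inserted out of order on purpose.
:ets.insert(scores, [{30, "carol"}, {10, "alice"}, {20, "bob"}])

keys = OrderedKeys.list(scores)
IO.inspect(keys)
```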
Here are two examples where I recently used ETS for caching purposes:
Feature Flags
We use continuous delivery at the company I work at, and feature flags are crucial if you're doing trunk-based development and want to integrate new, not yet finished code without enabling it for customers.
We store our feature flags in AWS Parameter Store, fetching them once every 5 minutes and caching them locally in an ETS table.
Soft Real-Time Stats
There are several cases where we need to display things like "X people have purchased that plan" or "That product was already used by Y customers".
A naive approach would be to perform database queries every time such values are required, but that would be extremely heavy (and unnecessary) on our database.
We cache those values and update them every couple of minutes. Although they might not reflect the real number, they indicate a "good enough" number, which is meaningful to customers.
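A periodic refresh like that could be sketched roughly as follows. This is not our production code: the StatsCache module, the :stats_cache table, the two-minute interval, and the numbers returned by fetch_stats/0 are all placeholders, and fetch_stats/0 stands in for the real database queries.

```elixir
defmodule StatsCache do
  use GenServer

  @table :stats_cache
  @refresh_interval :timer.minutes(2)

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  # Reads go straight to ETS; a missing key falls back to a default.
  def get(key, default \\ 0) do
    case :ets.lookup(@table, key) do
      [{^key, value}] -> value
      [] -> default
    end
  end

  @impl true
  def init(_opts) do
    :ets.new(@table, [:named_table, :set, :protected, read_concurrency: true])
    refresh()
    schedule_refresh()
    {:ok, nil}
  end

  @impl true
  def handle_info(:refresh, state) do
    refresh()
    schedule_refresh()
    {:noreply, state}
  end

  # Sends :refresh to this process after the interval has elapsed.
  defp schedule_refresh do
    Process.send_after(self(), :refresh, @refresh_interval)
  end

  defp refresh do
    :ets.insert(@table, fetch_stats())
  end

  # Hypothetical placeholder: a real implementation would query the database.
  defp fetch_stats do
    [{:plan_purchases, 1_250}, {:product_customers, 342}]
  end
end

{:ok, _pid} = StatsCache.start_link()
IO.inspect(StatsCache.get(:plan_purchases))
```

Between refreshes, every reader sees the last stored value, which is where the "good enough" number comes from.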
Things to Keep in Mind When Using ETS as a Cache Solution
No Unnecessary Dependencies
ETS is part of the Erlang VM, so there are no additional dependencies necessary when using it.
Optimized Lookup
The lookup of an item (or group of items) is optimized depending on the type of the ETS table.
No Serialization
ETS tables can store Elixir data structures. This means you can store a group of %Accounts.User{...} or %PlayerScore{} and you'll be able to fetch those structures back without having to serialize them.
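Here is a minimal sketch of what that looks like. The fields of PlayerScore, the ScoreCache module, and the :player_scores table are assumptions made for the example, not code from a real project.

```elixir
defmodule PlayerScore do
  # Hypothetical struct used only for this illustration.
  defstruct [:player, :points]
end

defmodule ScoreCache do
  @table :player_scores

  def setup, do: :ets.new(@table, [:named_table, :set, :public])

  # The struct is stored as-is, as the value of a {key, value} tuple.
  def put(player, points) do
    :ets.insert(@table, {player, %PlayerScore{player: player, points: points}})
  end

  # The value comes back as the same struct, ready for pattern matching.
  def fetch(player) do
    case :ets.lookup(@table, player) do
      [{^player, %PlayerScore{} = score}] -> {:ok, score}
      [] -> :error
    end
  end
end

ScoreCache.setup()
ScoreCache.put("ana", 42)

{:ok, score} = ScoreCache.fetch("ana")
IO.inspect(score)
```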
Not Garbage Collected
When an ETS table is no longer necessary, you must delete it manually using :ets.delete, otherwise it stays around for as long as the process that created it is alive.
A common solution is to create the table inside a dedicated process (e.g. a GenServer). Since an ETS table is owned by the process that created it, the table is deleted automatically when that process dies (or is terminated), so in that sense it gets "garbage collected".
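The ownership behavior can be observed with a short experiment. The sketch below uses a plain spawned process instead of a GenServer to keep it small; the :short_lived table name and the messages exchanged are made up for the example.

```elixir
parent = self()

# Spawn a process that creates a table, reports it back, and then waits.
owner =
  spawn(fn ->
    table = :ets.new(:short_lived, [:set, :public])
    send(parent, {:table, table})

    receive do
      :stop -> :ok
    end
  end)

table =
  receive do
    {:table, t} -> t
  end

# :ets.info/1 returns :undefined for a table that does not exist.
alive_before = :ets.info(table) != :undefined

# Stop the owner and wait until it is really gone.
ref = Process.monitor(owner)
send(owner, :stop)

receive do
  {:DOWN, ^ref, :process, _pid, _reason} -> :ok
end

# The VM deletes the table when its owner terminates.
alive_after = :ets.info(table) != :undefined

IO.inspect({alive_before, alive_after})
```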
Use of ETS in Distributed Systems
ETS tables are local to the node that created them, so they are not shared automatically between nodes in distributed Elixir. In both the examples I shared, the projects didn't connect multiple nodes in a distributed Elixir fashion anyway. Those features are part of a project deployed to several web servers, each one with its own local copy of the ETS table.
In both examples—"Feature Flags" and "Soft real-time stats"—ETS works well because Eventual Consistency is an acceptable side-effect.
How to Create and Use ETS Within a GenServer
Previously, I mentioned a case where I used ETS for "Feature Flags". In this section, I want to give you a better understanding of its internals.
The non-cached code is quite simple:
defmodule FeatureFlags do
  def enabled?(name) do
    name in get_list_of_enabled_flags()
  end

  defp get_list_of_enabled_flags() do
    # ... Access AWS, fetch values for specific project/env
  end
end
FeatureFlags.enabled?/1 returns either true if the feature flag is enabled, or false when it is not enabled.
The problem with this code lies in the fact that for every call to FeatureFlags.enabled?/1, the code performs requests to AWS and parses the response.
Since the changes to the features happen once or twice a day, we can safely cache successful returns from AWS and use the cached version.
What should we use for caching? If you guessed ETS, you're right!
Here's the cached version:
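defmodule FeatureFlags do
  use GenServer

  @table :features

  def start_link(_args) do
    GenServer.start_link(__MODULE__, nil, name: __MODULE__)
  end

  # Reads the flag straight from ETS, without going through the GenServer.
  def enabled?(name) do
    :ets.lookup_element(@table, name, 2)
  rescue
    # lookup_element/3 raises ArgumentError when the key (or the table) is missing.
    ArgumentError -> false
  end

  @impl true
  def init(nil) do
    # The table must exist before inserting; the GenServer process owns it.
    :ets.new(@table, [:named_table, :protected, read_concurrency: true])

    for feature_name <- get_list_of_enabled_flags() do
      :ets.insert(@table, {feature_name, true})
    end

    {:ok, nil}
  end

  defp get_list_of_enabled_flags() do
    # ... Access AWS, fetch values for specific project/env
  end
end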
The first difference is the use of a GenServer. As previously mentioned, ETS tables are not automatically garbage collected. When the table is created by a GenServer, it is deleted together with the GenServer process when that process exits.
In init/1, we fetch all the features from AWS and insert them in the ETS table. The init/1 function is called only once. After that, all the requests to the FeatureFlags module are going to use the cached version.
(In that example, for simplicity reasons, we are not dealing with error handling or the update of the features).
With our modified version, new calls to FeatureFlags.enabled?/1 use ETS to look up the values.
You might be wondering: "But since you're using a GenServer, why not use the GenServer state itself to hold the feature list instead?".
And that's a totally valid point. The main reason behind that decision is that every call to access the GenServer state (e.g. using handle_call or handle_cast) goes to the process mailbox and is processed serially. That's the reason why FeatureFlags.enabled?/1 reads from the ETS table directly instead of asking the GenServer for its state.
That approach achieves two essential points:
- It makes the ETS table garbage collected by design, and
- It achieves concurrency, by not serializing requests inside the GenServer.
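To illustrate the second point, here is a rough sketch in which many concurrent tasks read a table directly, without sending a single message to the process that owns it. The :features_demo table, the flag names, and the anonymous enabled? function are made up for the example; here the current process owns the table, whereas in the design above the GenServer would.

```elixir
# :protected lets any process read, while only the owner can write.
:ets.new(:features_demo, [:named_table, :set, :protected, read_concurrency: true])
:ets.insert(:features_demo, [{:new_checkout, true}, {:dark_mode, true}])

enabled? = fn name ->
  case :ets.lookup(:features_demo, name) do
    [{^name, value}] -> value
    [] -> false
  end
end

# 1,000 lookups spread over concurrent tasks; half of them ask for a flag
# that does not exist and therefore get false.
results =
  1..1_000
  |> Task.async_stream(fn i ->
    name = if rem(i, 2) == 0, do: :new_checkout, else: :missing_flag
    enabled?.(name)
  end)
  |> Enum.map(fn {:ok, value} -> value end)

IO.inspect(Enum.frequencies(results))
```

If the same lookups went through GenServer.call/2, they would queue up in a single mailbox and be answered one at a time.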
Summary
Developers building applications using Elixir are fortunate to have ETS available as part of the Erlang VM toolbelt.
ETS provides the right mix of simplicity (removing the need for integrating yet another tool into a project) and performance (it's a fast and battle-tested piece of software).
Keep in mind that, like anything, ETS is not suitable for every caching problem. There are cases where ETS is not a good idea, and you need to rely on other tools to solve the problem, so take this content with a grain of salt and make sure ETS meets your needs before sticking to it.