`async: false` is the worst. Here's how to never need it again.

JB Steadman

A common challenge when testing Elixir apps is dealing with shared or global state. When multiple tests read or modify the same piece of state, we have to serialize them with async: false to keep them from interfering with each other.

To isolate tests from each other and avoid async: false, here we explore a technique for “localizing” global state. That is, we reduce the scope of in-memory state such that each ExUnit test has its own private copy of the state. With global state thus localized, our ExUnit tests are decoupled and safe for async: true.

This post will focus on how the technique works. Future posts will demonstrate how to use the technique for specific purposes, such as:

  • Customizing environment variables in tests, while preserving async: true
  • Testing code that uses “singleton” GenServers
  • Using protocols, rather than behaviours, to model external services

Processes all the way down

Let’s set the stage by examining the process architecture of an ExUnit test suite:

ExUnit starts a new process for every test module. When running with async: true, there may be many such processes running at once. Each test module process runs one individual test at a time. The individual test processes each spawn a supervisor process for use with ExUnit’s start_supervised() functions.

Individual test processes may spawn children of their own - Tasks, GenServers, Supervisors, Agents, or raw processes started via spawn() - sometimes directly, other times via the Test Supervisor. Some of these child processes may be started by our test code, others by our production code. In some cases, our own production code starts child processes directly. In other cases, child processes are started by production library code outside of our control - for example when Phoenix spawns a LiveView process. Some child processes start children of their own.

Any of these child processes might need a particular piece of data. Particularly when a process is started outside of our direct control, it might not be possible or convenient to initialize a process with the data that it needs.

Because it can be hard to get an individual piece of data to a process that needs it, we sometimes reach for the illusory convenience of global references. For example, we might store the data in the Application environment under a static, compile-time key. This creates complications if we want to customize the data for our tests.
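To make the complication concrete, here is a minimal sketch of two “tests” customizing the same Application environment key. The Task stands in for a second test running concurrently, and :my_app and :my_data are placeholder names chosen for illustration.

```elixir
# :my_app and :my_data are placeholder names for this illustration.
Application.put_env(:my_app, :my_data, "value for test A")

# A second "test", running in another process, customizes the same key.
Task.async(fn ->
  Application.put_env(:my_app, :my_data, "value for test B")
end)
|> Task.await()

# Test A now reads whatever was written last, not the value it set.
seen_by_test_a = Application.get_env(:my_app, :my_data)
IO.inspect(seen_by_test_a, label: "test A sees")
```

Because the key is shared by every process in the VM, the only way to keep such tests apart is to run them one at a time.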

There’s a better way!

Instead of resorting to global references, we can take advantage of Erlang/Elixir’s process architecture to scope data to individual tests. To accomplish this, we use two features of Erlang’s process APIs: the process dictionary, and the process ancestry hierarchy. Before taking a closer look at each of those features, let’s see the high-level basics of how we use them.

For any otherwise-global data that we want to localize to a test, we first write the data to the process dictionary of the ExUnit test pid. For example, imagine we’ve been storing a piece of data in the Application environment under the key :my_data, and imagine we want to customize the value in a test. Writing the data to the process dictionary of our test pid will look something like this:

test "my code works correctly" do
  Process.put(:my_data, "custom value specific to this test")

  # start child processes, run our code, make assertions, etc. ...
end

Step two is retrieving the data. Let’s imagine our test spawns a small subtree of processes, for example a process :child_a that in turn starts a process :child_b.

Let’s suppose we need to access :my_data from :child_b. For this, we use a module called ProcessTree - available on Hex. From within the code executed by :child_b, we call ProcessTree.get(). For example:

def get_my_data() do
  ProcessTree.get(:my_data, default: Application.get_env(:my_app, :my_data))
end

To find the value for :my_data, ProcessTree.get() looks first in the process dictionary of the calling process - in this case, :child_b. When it doesn’t find a value there, it next looks in the process dictionary of the parent process - :child_a.

If no value is found in the parent’s process dictionary, ProcessTree.get() keeps looking up the process hierarchy until it finds the value that we’ve stored in the process dictionary of the ExUnit test pid.

In production, when we only need a single, unchanging value for :my_data, we can fall back to using Application.get_env(). As indicated above, ProcessTree.get() will return the default value that we pass in, after caching it in the process dictionary of the calling process.

To recap, our two step process is:

  1. Store the data in the process dictionary of our test pid
  2. In our production code, use ProcessTree.get() to look up the process hierarchy until we find the data
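The lookup in step two can be sketched with nothing but the standard library. This is not ProcessTree’s actual implementation: TreeLookupSketch is a hypothetical module written for illustration, it assumes OTP 25+ (for the :parent item of Process.info/2) and local processes only, and it stops searching as soon as it meets an ancestor that is no longer alive.

```elixir
defmodule TreeLookupSketch do
  # Hypothetical helper for illustration; not ProcessTree's real code.
  # Assumes OTP 25+ and processes on the local node.

  def get(key, default \\ nil) do
    case lookup(self(), key) do
      {:ok, value} ->
        value

      :error ->
        # Nothing found in any ancestor: cache the default in the caller.
        Process.put(key, default)
        default
    end
  end

  defp lookup(pid, key) when is_pid(pid) do
    case Process.info(pid, :dictionary) do
      {:dictionary, entries} ->
        case List.keyfind(entries, key, 0) do
          {_key, value} -> {:ok, value}
          nil -> lookup(parent_of(pid), key)
        end

      # The process is no longer alive, so its dictionary is gone.
      nil ->
        :error
    end
  end

  # Reached the top of the tree (the parent was :undefined).
  defp lookup(_no_parent, _key), do: :error

  defp parent_of(pid) do
    case Process.info(pid, :parent) do
      {:parent, parent} -> parent
      nil -> :undefined
    end
  end
end

# Usage: the "test" process stores a value, and a grandchild finds it.
test_pid = self()
Process.put(:my_data, "custom value specific to this test")

child_a =
  spawn(fn ->
    # child_b performs the lookup and reports back to the test process.
    spawn(fn -> send(test_pid, {:child_b_saw, TreeLookupSketch.get(:my_data)}) end)

    # Stay alive so child_b can walk through this process.
    receive do
      :stop -> :ok
    end
  end)

result =
  receive do
    {:child_b_saw, value} -> value
  after
    1_000 -> :timeout
  end

send(child_a, :stop)
IO.inspect(result, label: "child_b found")
```

A real implementation also has to cope with older OTP releases and with ancestors that have already exited, which is part of what the library takes care of.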

Now that we’ve seen the basics, let’s take a closer look at the details.

What’s the process dictionary?

The process dictionary is an in-memory hashtable that’s part of the struct for each Erlang process. It enables storage and retrieval of arbitrary process-specific data. Perhaps surprisingly, the process dictionary is mutable state, though it’s only mutable by the process that owns the dictionary.
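A short sketch of these properties, using only Process.put/2, Process.get/1 and Process.info/2 from the standard library. The key :my_data is a placeholder. Each process has its own dictionary, and another process can read it through Process.info/2 but cannot write to it.

```elixir
owner = self()
Process.put(:my_data, "set by owner")

{own_view, read_from_owner} =
  Task.async(fn ->
    # This process has its own dictionary, so the key is absent here.
    own_view = Process.get(:my_data)

    # Another process's dictionary can be read (not written) via Process.info/2.
    {:dictionary, entries} = Process.info(owner, :dictionary)
    {_key, read_from_owner} = List.keyfind(entries, :my_data, 0)

    {own_view, read_from_owner}
  end)
  |> Task.await()

IO.inspect(own_view, label: "task's own view")
IO.inspect(read_from_owner, label: "read from owner")
```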

The process dictionary is implemented in C. Operations on it are exceedingly fast. There is no performance penalty for using it sensibly. In fact, in “don’t-try-this-at-home” style, Elixir creator José Valim used the process dictionary, in place of an Elixir Map, to improve the performance of a piece of Broadway code.

Process dictionary: friend or foe?

The process dictionary is conceptually identical to thread-local storage in languages like Java or Python. Because it can be misused, thread-local storage is sometimes controversial. Likewise for the process dictionary.

Those urging avoidance of the process dictionary sometimes point out that we should pass data down the stack not with thread-local storage, but rather through ordinary function parameters. They are 100% correct that we should use function parameters whenever possible. Function parameters are always Plan A. However, especially in multi-process environments such as Phoenix applications, it’s not always possible to pass data through stack layers via function parameters.

Another common warning is that the process dictionary is akin to global variables, or that relying on it is a slippery slope to global variables. Crucially however, in our case, we are using the process dictionary to make data less global, not more - a very worthy motivation.

You’re already using the process dictionary

The process dictionary is used by two ubiquitous pieces of Elixir code: Ecto SQL and Logger. In a manner similar to the approach discussed here, Ecto SQL uses the process dictionary to scope the use of database connections to specific processes. Within ExUnit tests, that’s how the Ecto Sandbox works - an ExUnit pid checks out a DB connection from a connection pool and stores it in the process dictionary.

The approach outlined above also requires navigating up the process ancestry hierarchy. But how? As of OTP 25.0, we can use Process.info/2 with the :parent item to find the parent of any Elixir process. In earlier versions of OTP (and still in current ones), ancestors are recorded only for OTP-compliant processes, under the undocumented $ancestors key in the process dictionary. Elixir’s Task also sets $ancestors, and provides a similar mechanism under the $callers key.
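The three mechanisms can be inspected directly. The following is a small sketch rather than a recommended pattern: the first part assumes OTP 25+, and the $ancestors and $callers keys are internal details that may change.

```elixir
parent = self()

# OTP 25+: ask the runtime for the parent of any local process,
# even one started with a raw spawn/1.
child =
  spawn(fn ->
    receive do
      :stop -> :ok
    end
  end)

{:parent, parent_from_info} = Process.info(child, :parent)
send(child, :stop)

# OTP-compliant processes (here an Agent) record their ancestors
# in their own process dictionary under :"$ancestors".
{:ok, agent} = Agent.start_link(fn -> nil end)
ancestors = Agent.get(agent, fn _state -> Process.get(:"$ancestors") end)
Agent.stop(agent)

# Task additionally records the calling processes under :"$callers".
callers =
  Task.async(fn -> Process.get(:"$callers") end)
  |> Task.await()

IO.inspect(parent_from_info == parent, label: "parent via Process.info/2")
IO.inspect(ancestors, label: "agent $ancestors")
IO.inspect(callers, label: "task $callers")
```

Note that $ancestors holds a registered name instead of a pid when the parent process is registered, so code that walks it has to handle both forms.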

For a more complete discussion, see How To Get the Parent of an Elixir Process.

Packaging it all together

ProcessTree manages the complications of navigating the process ancestry hierarchy. It’s compatible with all recent versions of OTP, using Process.info/2 when available, and falling back to $ancestors when necessary. It was developed specifically for avoiding global state in Elixir applications, and its developer would be delighted if you checked it out :-)