Christophe Bornet

Introducing BlockBuster: is my asyncio event loop blocked?

The asyncio library was added in Python 3.4, and the async/await syntax followed in Python 3.5, as an alternative to threads for handling concurrency.
Asynchronous I/O, and its asyncio implementation in Python, promises three things. First, by not spawning memory-expensive OS threads, systems use fewer resources and scale better. Second, schedule points in asyncio are explicit through the await syntax, whereas in thread-based concurrency the GIL may be released at hard-to-guess points in the code, so asyncio-based concurrent systems are easier to reason about and to debug. Finally, asyncio tasks can be cancelled, which is not easily done with threads.

But to really benefit from these promises, it is essential to avoid blocking calls in coroutines. A blocking call can be a network call, a file system call, a sleep call and so on.
These calls are harmful because, under the hood, asyncio runs all coroutines concurrently on a single-threaded event loop. A blocking call made in one coroutine therefore blocks the entire event loop and every other coroutine with it, which degrades the overall performance of the application.

Here is an example where a blocking call prevents concurrent execution of the code:

import asyncio
import datetime
import time

async def example(name):
    print(f"{datetime.datetime.now()}: {name} start")
    time.sleep(1) # time.sleep is a blocking function
    print(f"{datetime.datetime.now()}: {name} stop")

async def main():
    await asyncio.gather(example("1"), example("2"))

asyncio.run(main())

When run, this outputs something like:

2025-01-07 18:50:15.327677: 1 start
2025-01-07 18:50:16.328330: 1 stop
2025-01-07 18:50:16.328404: 2 start
2025-01-07 18:50:17.333159: 2 stop

We see that the two coroutines did not run concurrently: the second one started only after the first one had finished.

To overcome this you need to use a non-blocking equivalent or defer the execution to a thread-pool:

import asyncio
import datetime

async def example(name):
    print(f"{datetime.datetime.now()}: {name} start")
    await asyncio.sleep(1) # replaced the blocking time.sleep call by the non-blocking asyncio.sleep coroutine
    print(f"{datetime.datetime.now()}: {name} stop")

async def main():
    await asyncio.gather(example("1"), example("2"))

asyncio.run(main())

When run, this outputs something like:

2025-01-07 18:53:53.579738: 1 start
2025-01-07 18:53:53.579797: 2 start
2025-01-07 18:53:54.580463: 1 stop
2025-01-07 18:53:54.580572: 2 stop

Here the two coroutines ran concurrently: both started at the same time and finished about one second later.
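When no non-blocking equivalent exists, the other option mentioned above, deferring the call to a thread pool, could look like the following minimal sketch. It relies on asyncio.to_thread, which is available since Python 3.9; blocking_work and run_both are hypothetical names used only for illustration.

```python
import asyncio
import time

def blocking_work(name, delay):
    # Hypothetical stand-in for a blocking call that has no async equivalent.
    time.sleep(delay)
    return name

async def run_both(delay=0.5):
    start = time.perf_counter()
    # asyncio.to_thread runs each call in the default thread pool,
    # so the event loop stays free while the worker threads sleep.
    results = await asyncio.gather(
        asyncio.to_thread(blocking_work, "1", delay),
        asyncio.to_thread(blocking_work, "2", delay),
    )
    return results, time.perf_counter() - start

if __name__ == "__main__":
    results, elapsed = asyncio.run(run_both())
    print(results, f"{elapsed:.2f}s")
```

Because the two calls overlap in separate threads, the total time should be close to one delay rather than two.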

The problem is that it is not always easy to tell whether a function is blocking, especially when the code base is big or relies on third-party libraries. Sometimes blocking calls are made in deeply buried parts of the code.

For instance, is this code blocking?

import blockbuster
from importlib.metadata import version

async def get_version():
    return version("blockbuster")

Did Python load the package metadata into memory at startup? Is it loaded when the blockbuster module is imported? Or when we call version()? Is the result cached, so that subsequent calls are non-blocking?
The correct answer is that it happens when version() is called, and that it involves reading the METADATA file of the installed package. The result is not cached. So version() is a blocking call and should always be deferred to a thread. This is hard to know without diving into the code of importlib.
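Deferring version() to a thread, as recommended above, could be sketched as follows. This assumes Python 3.9+ for asyncio.to_thread; returning None for a package that is not installed is a choice made for this sketch, not something the original snippet does.

```python
import asyncio
from importlib.metadata import PackageNotFoundError, version

async def get_version(package):
    # version() reads the package metadata from disk,
    # so run it in a worker thread instead of on the event loop.
    try:
        return await asyncio.to_thread(version, package)
    except PackageNotFoundError:
        # Assumption for this sketch: report a missing package as None.
        return None

if __name__ == "__main__":
    print(asyncio.run(get_version("blockbuster")))
```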

One way of detecting blocking calls is to activate asyncio's debug mode, which logs any callback that holds the event loop for longer than a threshold (100 ms by default). But this is not the most effective approach: many blocking calls that are shorter than the threshold still hurt performance, and blocking times in test or development may differ from those in production. For instance, a database call may take longer in production if it has to fetch a lot of data.
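The limitation of the debug-mode threshold can be seen in a small experiment. The sketch below runs a blocking coroutine with asyncio.run(..., debug=True) and collects the slow-callback warnings emitted on the "asyncio" logger; ListHandler, block_for and slow_callback_warnings are hypothetical helper names, and filtering on the word "took" relies on the wording of the warning in current CPython versions.

```python
import asyncio
import logging
import time

class ListHandler(logging.Handler):
    """Collects formatted log messages in a list."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())

async def block_for(seconds):
    time.sleep(seconds)  # blocks the event loop on purpose

def slow_callback_warnings(seconds):
    handler = ListHandler()
    logger = logging.getLogger("asyncio")
    logger.addHandler(handler)
    try:
        # debug=True enables asyncio's debug mode for this run.
        asyncio.run(block_for(seconds), debug=True)
    finally:
        logger.removeHandler(handler)
    # Slow-callback warnings read "Executing <...> took N seconds".
    return [m for m in handler.messages if "took" in m]

if __name__ == "__main__":
    print(len(slow_callback_warnings(0.01)))  # shorter than the threshold
    print(len(slow_callback_warnings(0.3)))   # longer than the threshold
```

The short blocking call goes unreported even though it still stalls the loop. The threshold can be lowered through the loop's slow_callback_duration attribute, but some threshold always remains.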

This is where BlockBuster saves the day!
When activated, BlockBuster monkey-patches several blocking functions of the Python standard library so that they raise an error when called from an asyncio event loop.
By default, it patches functions from the os, io, time, socket and sqlite3 modules. For the full list of functions detected by BlockBuster, see the project README.
You can then activate BlockBuster during your unit tests or in development mode to catch blocking calls and fix them.
If you know the awesome BlockHound library for the JVM, it's the same principle but for Python. BlockHound was a great inspiration for BlockBuster, kudos to its creators.
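To make the principle concrete, here is a hand-rolled sketch of patching a single function, time.sleep, so that it raises when called while an event loop is running in the current thread. This is not BlockBuster's actual implementation, which covers many more functions and cases; forbid_sleep_in_event_loop and BlockingCallError are hypothetical names.

```python
import asyncio
import contextlib
import functools
import time

class BlockingCallError(Exception):
    """Hypothetical error type for this sketch."""

@contextlib.contextmanager
def forbid_sleep_in_event_loop():
    original = time.sleep

    @functools.wraps(original)
    def guarded(seconds):
        try:
            # Raises RuntimeError when no event loop runs in this thread.
            asyncio.get_running_loop()
        except RuntimeError:
            return original(seconds)  # outside a loop: behave as usual
        raise BlockingCallError("Blocking call to time.sleep")

    time.sleep = guarded  # monkey-patch
    try:
        yield
    finally:
        time.sleep = original  # always restore the real function

async def bad_coroutine():
    time.sleep(0.01)

if __name__ == "__main__":
    with forbid_sleep_in_event_loop():
        time.sleep(0.01)  # allowed: no event loop is running here
        try:
            asyncio.run(bad_coroutine())
        except BlockingCallError as exc:
            print(f"caught: {exc}")
```

Calls made from a worker thread, for example through asyncio.to_thread, still go through because no event loop is running in that thread.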

Let's see how to use BlockBuster on the above snippet of blocking code.

First, we need to install the blockbuster package:

pip install blockbuster

Then we can use a pytest fixture and the blockbuster_ctx() context manager to activate BlockBuster at the beginning of each test and deactivate it during teardown:

import asyncio
import datetime
import pytest
import time
from blockbuster import blockbuster_ctx

async def example(name):
    print(f"{datetime.datetime.now()}: {name} start")
    time.sleep(1)
    print(f"{datetime.datetime.now()}: {name} stop")

async def main():
    await asyncio.gather(example("1"), example("2"))

@pytest.fixture(autouse=True)
def blockbuster():
    with blockbuster_ctx() as bb:
        yield bb

async def test_main():
    await main()

If you run this with pytest, with a plugin such as pytest-asyncio set up to run async test functions, you get:

FAILED test_example.py::test_main - blockbuster.blockbuster.BlockingError: Blocking call to sleep (<module 'time' (built-in)>)

Note: in a real project, the blockbuster() fixture would typically be defined in a conftest.py file.

Conclusion

I believe BlockBuster is quite useful in asyncio projects. It has already helped me detect a lot of blocking calls in projects I work on.
But it's not a silver bullet. In particular, some third-party libraries don't go through Python standard library functions to interact with the network or the file system and wrap a C library instead. For these, you can add rules in your test setup that trigger on the blocking calls of those libraries. BlockBuster is also open source: contributions that add rules for your favourite library to the core project are very welcome.
And if you see issues or things that could be improved, I will be pleased to get your feedback in the project issue tracker.

