Sam

Originally published at spwoodcock.dev

Combining sync and async Python code: writing a DRY package

Async Python

Directly from the Python docs:

asyncio is a library to write concurrent code using the async/await syntax.

asyncio is used as a foundation for multiple Python asynchronous frameworks that provide high-performance network and web-servers, database connection libraries, distributed task queues, etc.

asyncio is often a perfect fit for IO-bound and high-level structured network code.

A typical use case for asynchronous code is a web server built on the ASGI spec, using a framework such as FastAPI or Litestar.

However, sync and async code do not play nicely together.

Often we need an async version of a library when only a sync one is available, or vice versa.

This article discusses a simple pattern for writing Python packages that are compatible with both synchronous and asynchronous code, while only having to write the functionality once (no separate libraries).

Combining Sync and Async Code

First I will briefly describe the issues with mixing these two paradigms.

How Async Works

  • Each Python interpreter runs in a process on the system.
  • Async code runs on an event loop. asyncio allows one running event loop per thread, and most programs use a single loop in the main thread.
  • An event loop can run multiple pieces of code (coroutines) concurrently: while one coroutine is awaiting a result, such as a network response, another can execute.
  • This has huge benefits for IO-bound tasks, such as downloading multiple files or sending simultaneous database queries.
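
To make the concurrency point concrete, here is a minimal sketch. The `fetch` coroutine and its delays are made up for illustration, with `asyncio.sleep` standing in for real network IO. Three 0.2 second waits should complete in roughly 0.2 seconds in total rather than 0.6, because the event loop switches between coroutines while each one is waiting.

```python
import asyncio
import time


async def fetch(name: str, delay: float) -> str:
    # asyncio.sleep stands in for a network call (hypothetical workload);
    # awaiting it hands control back to the event loop.
    await asyncio.sleep(delay)
    return name


async def main() -> list[str]:
    # gather() schedules all three coroutines on the running event loop
    # and returns their results in the order they were passed in.
    return await asyncio.gather(fetch("a", 0.2), fetch("b", 0.2), fetch("c", 0.2))


start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f}s")
```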

Sync From Async

  • Calling synchronous code from async code typically blocks the event loop, so no other async code can run in the meantime.
  • This means that when you hit a piece of synchronous code, you essentially block everything else on the loop from running until it completes, negating most of the benefits of async.
  • It also means that if async tasks were started prior to the sync code running, the execution of these tasks will not proceed until the sync code allows the event loop to run again. This problem is compounded by CPU-heavy tasks, which do not yield control back to the event loop.
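
The blocking effect can be observed with a small experiment. This is only a sketch: the `ticker` task and the timings are invented for illustration. A background task tries to tick every 0.05 seconds, while the main coroutine either calls `time.sleep` directly (blocking the loop) or hands the same call to a worker thread with `asyncio.to_thread` (Python 3.9+). The latter is one standard-library way to keep the loop free and is shown only for contrast; it is not part of the pattern this article describes.

```python
import asyncio
import time


async def ticker(ticks: list[float], start: float) -> None:
    # Records the time at which each tick actually ran.
    for _ in range(3):
        await asyncio.sleep(0.05)
        ticks.append(time.perf_counter() - start)


async def run(blocking: bool) -> list[float]:
    ticks: list[float] = []
    start = time.perf_counter()
    task = asyncio.create_task(ticker(ticks, start))
    if blocking:
        # Plain sync call: the event loop cannot run the ticker for 0.3s.
        time.sleep(0.3)
    else:
        # Same call, offloaded to a worker thread so the loop stays free.
        await asyncio.to_thread(time.sleep, 0.3)
    await task
    return ticks


blocked_ticks = asyncio.run(run(blocking=True))
free_ticks = asyncio.run(run(blocking=False))
print(f"first tick with blocking call: {blocked_ticks[0]:.2f}s")
print(f"first tick with to_thread:     {free_ticks[0]:.2f}s")
```

In the blocking run, the first tick cannot happen until the `time.sleep` call has returned, even though the task was created before it.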

Async From Sync

  • Synchronous code cannot use the async/await keywords, hence async code cannot be executed in its normal manner.
  • To get around this, async code can be run in a newly spawned event loop (asyncio.run() or loop.run_until_complete()), a separate thread (ThreadPoolExecutor) or process (ProcessPoolExecutor).
  • Doing this significantly complicates the code (possibly introducing threading issues, multiple loops, and unexpected blocking), while providing none of the benefits of having async code in the first place.
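
As a rough sketch of the workarounds listed above (the function names here are hypothetical), the two most common options look like this. Note that `asyncio.run()` raises a `RuntimeError` if the calling thread already has a running event loop, which is one reason the thread-based variant sometimes becomes necessary.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


async def fetch_value() -> int:
    # Placeholder for an async library call.
    await asyncio.sleep(0.01)
    return 42


def sync_caller() -> int:
    # Option 1: start a fresh event loop just for this call.
    # Fails with RuntimeError if a loop is already running in this thread.
    return asyncio.run(fetch_value())


def sync_caller_in_thread() -> int:
    # Option 2: run the fresh loop in a worker thread and block on the result.
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, fetch_value()).result()


print(sync_caller(), sync_caller_in_thread())
```

Both versions block the caller until the coroutine finishes, so the async code gains nothing from being async here.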

How To Write A Package That Is Both Sync / Async

  • Always write async first!
  • It's much easier to strip the async-specific tokens from code with a script than to add them to sync code.
  • Once we have our asynchronous implementation, we can use a script to convert it into a synchronous equivalent.
  • This way you can release a package with both implementations, and your users can pick the one that fits their use case.
  • There is a nice package called unasync that can do this for you, but the simplest and cleanest implementation I have found was in the encode/httpcore package:
#!venv/bin/python
import os
import re
import sys
from pprint import pprint

SUBS = [
    ('from .._backends.auto import AutoBackend', 'from .._backends.sync import SyncBackend'),
    ('import trio as concurrency', 'from tests import concurrency'),
    ('AsyncIterator', 'Iterator'),
    ('Async([A-Z][A-Za-z0-9_]*)', r'\2'),
    ('async def', 'def'),
    ('async with', 'with'),
    ('async for', 'for'),
    ('await ', ''),
    ('handle_async_request', 'handle_request'),
    ('aclose', 'close'),
    ('aiter_stream', 'iter_stream'),
    ('aread', 'read'),
    ('asynccontextmanager', 'contextmanager'),
    ('__aenter__', '__enter__'),
    ('__aexit__', '__exit__'),
    ('__aiter__', '__iter__'),
    ('@pytest.mark.anyio', ''),
    ('@pytest.mark.trio', ''),
    ('AutoBackend', 'SyncBackend'),
]
COMPILED_SUBS = [
    (re.compile(r'(^|\b)' + regex + r'($|\b)'), repl)
    for regex, repl in SUBS
]

USED_SUBS = set()

def unasync_line(line):
    for index, (regex, repl) in enumerate(COMPILED_SUBS):
        old_line = line
        line = re.sub(regex, repl, line)
        if old_line != line:
            USED_SUBS.add(index)
    return line


def unasync_file(in_path, out_path):
    with open(in_path, "r") as in_file:
        with open(out_path, "w", newline="") as out_file:
            for line in in_file.readlines():
                line = unasync_line(line)
                out_file.write(line)


def unasync_file_check(in_path, out_path):
    with open(in_path, "r") as in_file:
        with open(out_path, "r") as out_file:
            for in_line, out_line in zip(in_file.readlines(), out_file.readlines()):
                expected = unasync_line(in_line)
                if out_line != expected:
                    print(f'unasync mismatch between {in_path!r} and {out_path!r}')
                    print(f'Async code:         {in_line!r}')
                    print(f'Expected sync code: {expected!r}')
                    print(f'Actual sync code:   {out_line!r}')
                    sys.exit(1)


def unasync_dir(in_dir, out_dir, check_only=False):
    for dirpath, dirnames, filenames in os.walk(in_dir):
        for filename in filenames:
            if not filename.endswith('.py'):
                continue
            rel_dir = os.path.relpath(dirpath, in_dir)
            in_path = os.path.normpath(os.path.join(in_dir, rel_dir, filename))
            out_path = os.path.normpath(os.path.join(out_dir, rel_dir, filename))
            print(in_path, '->', out_path)
            if check_only:
                unasync_file_check(in_path, out_path)
            else:
                unasync_file(in_path, out_path)


def main():
    check_only = '--check' in sys.argv
    unasync_dir("httpcore/_async", "httpcore/_sync", check_only=check_only)
    unasync_dir("tests/_async", "tests/_sync", check_only=check_only)

    if len(USED_SUBS) != len(SUBS):
        unused_subs = [SUBS[i] for i in range(len(SUBS)) if i not in USED_SUBS]

        print("These patterns were not used:")
        pprint(unused_subs)
        sys.exit(1)


if __name__ == '__main__':
    main()

All credit to the devs at encode for the implementation.

Without digging into the code too deeply, it should be fairly clear from the SUBS list what this script does: it rewrites async syntax into its sync equivalent, one regex substitution at a time.
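
To see the substitutions in action, here is a cut-down sketch that applies a handful of the SUBS entries from the script to a small, made-up `AsyncClient` class. The `\2` backreference works because each pattern is wrapped in a leading `(^|\b)` group when compiled, so the class name captured after `Async` becomes group 2.

```python
import re

# A reduced, illustrative subset of the SUBS table from the script.
SUBS = [
    ("Async([A-Z][A-Za-z0-9_]*)", r"\2"),
    ("async def", "def"),
    ("async with", "with"),
    ("await ", ""),
]
# Same wrapping as the script: the added leading group shifts the
# captured class name to group 2.
COMPILED_SUBS = [
    (re.compile(r"(^|\b)" + regex + r"($|\b)"), repl) for regex, repl in SUBS
]


def unasync_line(line: str) -> str:
    for regex, repl in COMPILED_SUBS:
        line = regex.sub(repl, line)
    return line


# Hypothetical async source, used only for this demonstration.
async_source = '''\
class AsyncClient:
    async def get(self, url):
        async with self._pool as pool:
            return await pool.request(url)
'''
sync_source = "".join(
    unasync_line(ln) for ln in async_source.splitlines(keepends=True)
)
print(sync_source)
```

The printed result is the same class with `Async`, `async` and `await` stripped, which is valid synchronous Python as long as the objects it uses (here `self._pool`) also have sync equivalents.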

Integrating Unasync.py

  1. Add the above unasync.py code to your repo.
  2. Place your async code in a _async directory.
  3. Modify the SUBS list and the unasync_dir calls in main() to match your project structure.
  4. Run the script to generate the sync code equivalent in the _sync directory.
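
The steps above can be sketched end to end on a toy package. Everything here is an assumption made for illustration: the `maths.py` module, the three-entry substitution list, and the temporary directory layout. This is not the httpcore script itself, just the same idea reduced to a few lines: write async code under `_async`, generate `_sync`, then import and call the generated code without an event loop.

```python
import importlib.util
import re
import tempfile
from pathlib import Path

# Hypothetical, minimal substitution list for this toy package.
SUBS = [("async def", "def"), ("async with", "with"), ("await ", "")]
COMPILED = [(re.compile(r"(^|\b)" + rx + r"($|\b)"), repl) for rx, repl in SUBS]

ASYNC_CODE = '''\
async def double(x):
    return x * 2


async def quadruple(x):
    return await double(await double(x))
'''


def unasync_dir(in_dir: Path, out_dir: Path) -> None:
    # Mirror every .py file from in_dir into out_dir with async syntax removed.
    for in_path in in_dir.rglob("*.py"):
        out_path = out_dir / in_path.relative_to(in_dir)
        out_path.parent.mkdir(parents=True, exist_ok=True)
        text = in_path.read_text()
        for regex, repl in COMPILED:
            text = regex.sub(repl, text)
        out_path.write_text(text)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "_async").mkdir()
    (root / "_async" / "maths.py").write_text(ASYNC_CODE)

    unasync_dir(root / "_async", root / "_sync")

    sync_path = root / "_sync" / "maths.py"
    sync_text = sync_path.read_text()
    # Load the generated module and call it like ordinary sync code.
    spec = importlib.util.spec_from_file_location("maths_sync", str(sync_path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    result = module.quadruple(3)

print(result)
```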

I followed this exact approach in a recent project I worked on with a volunteer at HOTOSM (Emir, an excellent dev!).

We were looking for a clean way to make the package usable in both sync and async (FastAPI) contexts, and the solution was surprisingly simple, but poorly documented online!

The full project and implementation can be viewed here.

Optional: Use Via Pre-Commit Hook

  • In the linked project above, I set a pre-commit hook to trigger the unasync.py script on each commit.
  • This means the synchronous code never gets out of sync (ha!) with the asynchronous code.
  • The config for the hook was:
repos:
  # Unasync: Convert async --> sync
  - repo: local
    hooks:
      - id: unasync
        name: unasync-all
        language: system
        entry: python unasync.py
        always_run: true
        pass_filenames: false
