Async Python
Directly from the Python docs:
asyncio is a library to write concurrent code using the async/await syntax.
asyncio is used as a foundation for multiple Python asynchronous frameworks that provide high-performance network and web-servers, database connection libraries, distributed task queues, etc.
asyncio is often a perfect fit for IO-bound and high-level structured network code.
A typical use case for asynchronous code is web frameworks built on the ASGI spec, such as FastAPI, Litestar, etc.
However, sync and async code do not play nicely together.
Often, we need an async library, but there isn't one available, or vice versa.
This article will discuss a simple pattern for writing Python packages that are compatible with both synchronous and asynchronous code, while only having to write the functionality once (no separate libraries)!
Combining Sync and Async Code
First I will briefly describe the issues with mixing these two paradigms.
How Async Works
- Each Python interpreter runs in a process on the system.
- Async code runs on an event loop, which is created on demand (for example by asyncio.run()) and runs in a single thread.
- An event loop can run multiple pieces of code (coroutines) concurrently, awaiting the return of something while another piece of code is executing.
- This has huge benefits for IO-bound tasks, such as downloading multiple files or sending simultaneous database queries.
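The points above can be sketched with a small example. This is a minimal illustration, not a benchmark: download() is a hypothetical stand-in that uses asyncio.sleep() in place of real network IO, and the delays are arbitrary. It shows that asyncio.gather() returns results in call order, while the coroutines actually finish in order of their waiting time because they run concurrently.

```python
import asyncio


async def download(name: str, delay: float, finished: list[str]) -> str:
    """Hypothetical download: asyncio.sleep() stands in for waiting on IO."""
    await asyncio.sleep(delay)  # control returns to the event loop here
    finished.append(name)  # record the order in which coroutines complete
    return name


async def main() -> tuple[list[str], list[str]]:
    finished: list[str] = []
    # All three coroutines wait concurrently on the same event loop.
    results = await asyncio.gather(
        download("a", 0.15, finished),
        download("b", 0.05, finished),
        download("c", 0.10, finished),
    )
    return results, finished


if __name__ == "__main__":
    results, finished = asyncio.run(main())
    print("gather order:    ", results)
    print("completion order:", finished)
```

The total runtime is roughly that of the slowest download rather than the sum of all three, which is where the benefit for IO-bound work comes from.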
Sync From Async
- Calling synchronous code from async code will typically block the event loop from executing other async code.
- This means that when you hit a piece of synchronous code, you essentially block everything below it from running until it completes, negating most of the benefits of async.
- It also means that if async tasks were started prior to the sync code running, the execution of these tasks will not proceed until the sync code allows the event loop to run again. This problem is compounded by CPU-heavy tasks, which do not yield control back to the event loop.
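A small sketch of the blocking behaviour described above. The names ticker() and count_ticks_during() are made up for this illustration, and time.sleep() stands in for any synchronous call. A background task records a "tick" every 10 ms; we count how many ticks happen while the main coroutine waits either synchronously or asynchronously.

```python
import asyncio
import contextlib
import time


async def ticker(ticks: list[float], interval: float = 0.01) -> None:
    """Background task: record a timestamp every `interval` seconds."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def count_ticks_during(blocking: bool, duration: float = 0.2) -> int:
    """Count how often the ticker runs while we wait for `duration` seconds."""
    ticks: list[float] = []
    task = asyncio.create_task(ticker(ticks))
    await asyncio.sleep(0)  # yield once so the ticker can start
    before = len(ticks)
    if blocking:
        time.sleep(duration)  # sync call: the event loop cannot run anything
    else:
        await asyncio.sleep(duration)  # async call: the loop keeps running
    after = len(ticks)
    task.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await task
    return after - before


if __name__ == "__main__":
    print("ticks during time.sleep:   ", asyncio.run(count_ticks_during(True)))
    print("ticks during asyncio.sleep:", asyncio.run(count_ticks_during(False)))
```

With the synchronous wait the ticker does not advance at all, even though it was started beforehand; with the asynchronous wait it keeps ticking.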
Async From Sync
- Synchronous code cannot use the async/await keywords, hence async code cannot be executed in its normal manner.
- To get around this, async code can be run in a newly spawned event loop (asyncio.run() or loop.run_until_complete()), a separate thread (ThreadPoolExecutor) or process (ProcessPoolExecutor).
- Doing this significantly complicates the code (possibly introducing threading issues, multiple loops, and unexpected blocking), while providing none of the benefits of having async code in the first place.
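The workarounds above might look like the following sketch. fetch_value() is a hypothetical coroutine invented for the example; asyncio.run() and ThreadPoolExecutor are the standard library calls. It also shows one of the complications: asyncio.run() refuses to start a second loop in a thread where one is already running.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


async def fetch_value() -> int:
    """Hypothetical async function we want to call from sync code."""
    await asyncio.sleep(0.01)
    return 42


def fetch_value_sync() -> int:
    # Spawn a fresh event loop just for this one call.
    return asyncio.run(fetch_value())


def fetch_value_via_thread() -> int:
    # Run the fresh event loop in a worker thread instead. The caller
    # still blocks on .result(), so nothing runs concurrently.
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, fetch_value()).result()


async def nested_run_fails() -> bool:
    """asyncio.run() raises if a loop is already running in this thread."""
    coro = fetch_value()
    try:
        asyncio.run(coro)
    except RuntimeError:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return True
    return False


if __name__ == "__main__":
    print(fetch_value_sync())
    print(fetch_value_via_thread())
    print(asyncio.run(nested_run_fails()))
```

Each call pays for setting up and tearing down an event loop, and the caller waits for the result anyway, so the async code gains nothing here.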
How To Write A Package That Is Both Sync / Async
- Always write async first!
- It's much easier to remove the async-specific tokens (async, await, and so on) from code with a script than it is to add them to synchronous code.
- Once we have our asynchronous implementation, we can use a script to convert it into a synchronous equivalent.
- This way you can release a package with both implementations available to your users: use the correct implementation for your use case.
- There is a nice package called unasync that can do this for you, but the simplest and cleanest implementation I have found was in the encode/httpcore package:
#!venv/bin/python
import os
import re
import sys
from pprint import pprint

SUBS = [
    ('from .._backends.auto import AutoBackend', 'from .._backends.sync import SyncBackend'),
    ('import trio as concurrency', 'from tests import concurrency'),
    ('AsyncIterator', 'Iterator'),
    ('Async([A-Z][A-Za-z0-9_]*)', r'\2'),
    ('async def', 'def'),
    ('async with', 'with'),
    ('async for', 'for'),
    ('await ', ''),
    ('handle_async_request', 'handle_request'),
    ('aclose', 'close'),
    ('aiter_stream', 'iter_stream'),
    ('aread', 'read'),
    ('asynccontextmanager', 'contextmanager'),
    ('__aenter__', '__enter__'),
    ('__aexit__', '__exit__'),
    ('__aiter__', '__iter__'),
    ('@pytest.mark.anyio', ''),
    ('@pytest.mark.trio', ''),
    ('AutoBackend', 'SyncBackend'),
]
COMPILED_SUBS = [
    (re.compile(r'(^|\b)' + regex + r'($|\b)'), repl)
    for regex, repl in SUBS
]

USED_SUBS = set()


def unasync_line(line):
    for index, (regex, repl) in enumerate(COMPILED_SUBS):
        old_line = line
        line = re.sub(regex, repl, line)
        if old_line != line:
            USED_SUBS.add(index)
    return line


def unasync_file(in_path, out_path):
    with open(in_path, "r") as in_file:
        with open(out_path, "w", newline="") as out_file:
            for line in in_file.readlines():
                line = unasync_line(line)
                out_file.write(line)


def unasync_file_check(in_path, out_path):
    with open(in_path, "r") as in_file:
        with open(out_path, "r") as out_file:
            for in_line, out_line in zip(in_file.readlines(), out_file.readlines()):
                expected = unasync_line(in_line)
                if out_line != expected:
                    print(f'unasync mismatch between {in_path!r} and {out_path!r}')
                    print(f'Async code: {in_line!r}')
                    print(f'Expected sync code: {expected!r}')
                    print(f'Actual sync code: {out_line!r}')
                    sys.exit(1)


def unasync_dir(in_dir, out_dir, check_only=False):
    for dirpath, dirnames, filenames in os.walk(in_dir):
        for filename in filenames:
            if not filename.endswith('.py'):
                continue
            rel_dir = os.path.relpath(dirpath, in_dir)
            in_path = os.path.normpath(os.path.join(in_dir, rel_dir, filename))
            out_path = os.path.normpath(os.path.join(out_dir, rel_dir, filename))
            print(in_path, '->', out_path)
            if check_only:
                unasync_file_check(in_path, out_path)
            else:
                unasync_file(in_path, out_path)


def main():
    check_only = '--check' in sys.argv
    unasync_dir("httpcore/_async", "httpcore/_sync", check_only=check_only)
    unasync_dir("tests/_async", "tests/_sync", check_only=check_only)

    if len(USED_SUBS) != len(SUBS):
        unused_subs = [SUBS[i] for i in range(len(SUBS)) if i not in USED_SUBS]

        print("These patterns were not used:")
        pprint(unused_subs)
        exit(1)


if __name__ == '__main__':
    main()
All credit to the devs at encode for the implementation.
Without digging into the code too deeply, it should be fairly clear from the SUBS list what this script does: it converts async syntax to sync syntax.
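To make the substitution step concrete, here is a cut-down sketch that applies a handful of the SUBS patterns to a source string. unasync_source() and the AsyncClient snippet are made up for this illustration. Note why the replacement is \2 rather than \1: every pattern is wrapped in a leading (^|\b) group, so the capture group written in the pattern becomes the second group.

```python
import re

# A small subset of the substitutions from the script above.
SUBS = [
    ("Async([A-Z][A-Za-z0-9_]*)", r"\2"),  # AsyncClient -> Client
    ("async def", "def"),
    ("async with", "with"),
    ("await ", ""),
]
# Same wrapping as the original script: only match on word boundaries.
COMPILED_SUBS = [
    (re.compile(r"(^|\b)" + regex + r"($|\b)"), repl) for regex, repl in SUBS
]


def unasync_source(source: str) -> str:
    """Apply every substitution in order to a block of source code."""
    for regex, repl in COMPILED_SUBS:
        source = regex.sub(repl, source)
    return source


ASYNC_SOURCE = (
    "class AsyncClient:\n"
    "    async def get(self):\n"
    "        async with self.lock:\n"
    "            return await self.fetch()\n"
)

if __name__ == "__main__":
    print(unasync_source(ASYNC_SOURCE))
```

Because this is plain text substitution, it only works well when the async code sticks to naming conventions that the patterns can match.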
Integrating Unasync.py
- Add the above unasync.py code to your repo.
- Place your async code in a _async directory.
- Modify the SUBS list and the unasync_dir usage in main() to match your project structure.
- Run the script to generate the sync code equivalent in the _sync directory.
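The steps above can be tried end to end with the sketch below. Everything project-specific is hypothetical: the package name mypackage, the db.py module, and the three substitutions. The unasync_dir() here is a simplified pathlib-based variant written for this example, not the httpcore function; unlike the script above it also creates missing output directories.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical substitutions for an imaginary package called "mypackage".
SUBS = [
    ("from mypackage._async", "from mypackage._sync"),
    ("async def", "def"),
    ("await ", ""),
]
COMPILED_SUBS = [
    (re.compile(r"(^|\b)" + regex + r"($|\b)"), repl) for regex, repl in SUBS
]


def unasync_dir(in_dir: Path, out_dir: Path) -> list[Path]:
    """Write a sync copy of every .py file under in_dir into out_dir."""
    written = []
    for in_path in sorted(in_dir.rglob("*.py")):
        out_path = out_dir / in_path.relative_to(in_dir)
        out_path.parent.mkdir(parents=True, exist_ok=True)
        text = in_path.read_text()
        for regex, repl in COMPILED_SUBS:
            text = regex.sub(repl, text)
        out_path.write_text(text)
        written.append(out_path)
    return written


def demo() -> str:
    """Build a throwaway project layout, convert it, return the sync code."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "mypackage"
        (root / "_async").mkdir(parents=True)
        (root / "_async" / "db.py").write_text(
            "from mypackage._async.conn import connect\n\n"
            "async def get_user(user_id):\n"
            "    conn = await connect()\n"
            "    return await conn.fetch(user_id)\n"
        )
        unasync_dir(root / "_async", root / "_sync")
        return (root / "_sync" / "db.py").read_text()


if __name__ == "__main__":
    print(demo())
```

Users of such a package would then import from mypackage._sync or mypackage._async depending on their use case.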
I followed this exact approach in a recent project I worked on with a volunteer at HOTOSM (Emir, an excellent dev!).
We were looking for a clean way to make the package available for both sync and async (FastAPI) use cases, and the solution was surprisingly simple, but poorly documented online!
The full project and implementation can be viewed here
Optional: Use Via Pre-Commit Hook
- In the linked project above, I set a pre-commit hook to trigger the unasync.py script on each commit.
- This means the synchronous code never gets out of sync (ha!) with the asynchronous code.
- The config for the hook was:
repos:
  # Unasync: Convert async --> sync
  - repo: local
    hooks:
      - id: unasync
        name: unasync-all
        language: system
        entry: python unasync.py
        always_run: true
        pass_filenames: false