Asynchronous code has become a mainstay of Python development. With asyncio now part of the standard library and many third-party packages providing features compatible with it, this paradigm is not going away anytime soon.
If you're writing asynchronous code, it's important to make sure all parts of your code are working together so one aspect of it isn't slowing everything else down. File I/O can be a common blocker on this front, so let's walk through how to use the aiofiles library to work with files asynchronously.
Starting with the basics, this is all the code you need to read the contents of a file asynchronously (within an async function):
async with aiofiles.open('filename', mode='r') as f:
    contents = await f.read()
print(contents)
Let's move on and dig deeper.
What is non-blocking code?
You may hear terms like "asynchronous", "non-blocking" or "concurrent" and be a little confused as to what they all mean. According to this much more detailed tutorial, two of the primary properties are:
- Asynchronous routines are able to “pause” while waiting on their ultimate result to let other routines run in the meantime.
- Asynchronous code, through the mechanism above, facilitates concurrent execution. To put it differently, asynchronous code gives the look and feel of concurrency.
So asynchronous code is code that can pause while waiting for a result, letting other code run in the meantime. It doesn't "block" other code from running, so we can call it "non-blocking" code.
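To see that pausing in action, here is a minimal sketch that uses only asyncio. The coroutine name `worker` and the delays are made up for illustration, and `asyncio.sleep` stands in for any slow wait such as file or network I/O.

```python
import asyncio


async def worker(name, delay, log):
    # Record when this coroutine starts, pause, then record when it finishes.
    log.append(f'{name} start')
    await asyncio.sleep(delay)  # While paused here, other coroutines can run.
    log.append(f'{name} done')


async def demo():
    log = []
    # Run both workers concurrently and wait for both to finish.
    await asyncio.gather(worker('slow', 0.3, log), worker('fast', 0.1, log))
    return log


events = asyncio.run(demo())
print(events)
# ['slow start', 'fast start', 'fast done', 'slow done']
```

The slow worker starts first but does not hold up the fast one: both are waiting at the same time, so the whole run takes roughly as long as the longest delay rather than the sum of the two.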
The asyncio library provides a variety of tools for Python developers to do this, and aiofiles provides even more specific functionality for working with files.
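For some intuition about what a library like aiofiles has to do: ordinary file reads are blocking calls, so they need to run somewhere other than the event loop's thread. The sketch below illustrates that general idea with the standard library's `loop.run_in_executor`. It is not aiofiles' actual implementation, and `read_text` and `read_text_async` are hypothetical helper names.

```python
import asyncio
import os
import tempfile


def read_text(path):
    # An ordinary, blocking file read.
    with open(path, mode='r') as f:
        return f.read()


async def read_text_async(path):
    loop = asyncio.get_running_loop()
    # Passing None uses the loop's default thread pool, so the blocking
    # read happens in a worker thread while the event loop stays free.
    return await loop.run_in_executor(None, read_text, path)


async def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'example.txt')
        with open(path, mode='w') as f:
            f.write('hello')
        return await read_text_async(path)


result = asyncio.run(demo())
print(result)
```

aiofiles wraps this kind of plumbing behind an interface that looks like the regular file API, which is why the examples in this post read so much like standard Python file I/O.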
Setting Up
Make sure to have your Python environment set up before we get started. Follow this guide up through the virtualenv section if you need some help. Getting everything working correctly, especially with respect to virtual environments, is important for isolating your dependencies if you have multiple projects running on the same machine. You will need Python 3.7 or higher to run the code in this post.
Now that your environment is set up, you'll need to install a third-party library. We're going to use aiofiles, so install it with the following command after activating your virtual environment:
pip install aiofiles==0.6.0
For the examples in the rest of this post, we'll be using JSON files of Pokemon API data corresponding to the original 150 Pokemon. You can download a folder with all of those here. With this you should be ready to move on and write some code.
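If you'd like to try the code before downloading the full data set, a small stand-in file is enough. The sketch below is an assumption-laden substitute rather than real API data: it writes a `ditto.json` containing only the two fields the examples in this post read (`name` and `moves`), whereas the real files contain many more. `SAMPLE_POKEMON` and `write_sample` are hypothetical names.

```python
import json
import os
import tempfile

# Minimal stand-in with only the fields the examples in this post use.
SAMPLE_POKEMON = {
    'name': 'ditto',
    'moves': [{'move': {'name': 'transform'}}],
}


def write_sample(directory):
    # Write the stand-in data to <directory>/ditto.json and return the path.
    path = os.path.join(directory, 'ditto.json')
    with open(path, mode='w') as f:
        json.dump(SAMPLE_POKEMON, f)
    return path


sample_dir = tempfile.mkdtemp()
sample_path = write_sample(sample_dir)

# Read it back to confirm the file round-trips.
with open(sample_path, mode='r') as f:
    loaded = json.load(f)
print(loaded['name'])
```

Pass your own project directory to `write_sample` instead of a temporary one if you want the file to sit next to your scripts.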
Reading from a file with aiofiles
Let's begin by simply opening a file corresponding to a particular Pokemon, parsing its JSON into a dictionary, and printing out its name:
import aiofiles
import asyncio
import json

async def main():
    async with aiofiles.open('articuno.json', mode='r') as f:
        contents = await f.read()
    pokemon = json.loads(contents)
    print(pokemon['name'])

asyncio.run(main())
When running this code, you should see "articuno" printed to the terminal. You can also iterate through the file asynchronously, line by line (this code will print out all 9271 lines of articuno.json):
import aiofiles
import asyncio

async def main():
    async with aiofiles.open('articuno.json', mode='r') as f:
        async for line in f:
            print(line)

asyncio.run(main())
Writing to a file with aiofiles
Writing to a file is also similar to standard Python file I/O. Let's say we wanted to create files containing a list of all moves that each Pokemon can learn. For a simple example, here's what we would do for the Pokemon Ditto, who can only learn the move "transform":
import aiofiles
import asyncio

async def main():
    async with aiofiles.open('ditto_moves.txt', mode='w') as f:
        await f.write('transform')

asyncio.run(main())
Let's try this with a Pokemon that has more than one move, like Rhydon:
import aiofiles
import asyncio
import json

async def main():
    # Read the contents of the json file.
    async with aiofiles.open('rhydon.json', mode='r') as f:
        contents = await f.read()

    # Load it into a dictionary and create a list of moves.
    pokemon = json.loads(contents)
    name = pokemon['name']
    moves = [move['move']['name'] for move in pokemon['moves']]

    # Open a new file to write the list of moves into.
    async with aiofiles.open(f'{name}_moves.txt', mode='w') as f:
        await f.write('\n'.join(moves))

asyncio.run(main())
If you open up rhydon_moves.txt, you should see a file with 112 lines, one move per line.
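The list comprehension in that example relies on the nested shape of the JSON: each entry in `moves` is a dictionary whose `move` key holds another dictionary with a `name`. Here is a small sketch of just that step, pulled out into a plain function so it can be checked without touching any files. The function name `extract_moves` and the sample values are made up for illustration.

```python
def extract_moves(pokemon):
    # Return the Pokemon's name and a flat list of its move names.
    name = pokemon['name']
    moves = [entry['move']['name'] for entry in pokemon['moves']]
    return name, moves


# Hypothetical data shaped like the fields used in this post.
sample = {
    'name': 'example',
    'moves': [
        {'move': {'name': 'move-one'}},
        {'move': {'name': 'move-two'}},
    ],
}

name, moves = extract_moves(sample)
text = '\n'.join(moves)  # The same string the examples write to the .txt file.
print(name)
print(text)
```

Keeping the parsing separate from the file handling like this can also make the asynchronous parts of the code shorter and easier to follow.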
Using asyncio to go through many files asynchronously
Now let's get a little more complicated and do this for all 150 Pokemon that we have JSON files for. Our code will have to read from every file, parse the JSON, and rewrite each Pokemon's moves to a new file:
import aiofiles
import asyncio
import json
from pathlib import Path

directory = 'directory/your/files/are/in'

async def main():
    pathlist = Path(directory).glob('*.json')

    # Iterate through all json files in the directory.
    for path in pathlist:
        # Read the contents of the json file.
        async with aiofiles.open(f'{directory}/{path.name}', mode='r') as f:
            contents = await f.read()

        # Load it into a dictionary and create a list of moves.
        pokemon = json.loads(contents)
        name = pokemon['name']
        moves = [move['move']['name'] for move in pokemon['moves']]

        # Open a new file to write the list of moves into.
        async with aiofiles.open(f'{directory}/{name}_moves.txt', mode='w') as f:
            await f.write('\n'.join(moves))

asyncio.run(main())
After running this code, you should see the directory of Pokemon files populated with .txt files alongside the .json ones, containing move lists corresponding to each Pokemon.
If you need to perform some asynchronous actions and want to end with data corresponding to those asynchronous tasks, such as a list with each Pokemon's moves after having written the files, you can use asyncio.ensure_future and asyncio.gather.
You can break out the portion of your code that handles each file into its own async function, and append a task for each of those function calls to a list. Here's an example of what that function and your new main function would look like:
async def write_pokemon_moves(filename):
    # Read the contents of the json file.
    async with aiofiles.open(f'{directory}/{filename}', mode='r') as f:
        contents = await f.read()

    # Load it into a dictionary and create a list of moves.
    pokemon = json.loads(contents)
    name = pokemon['name']
    moves = [move['move']['name'] for move in pokemon['moves']]

    # Open a new file to write the list of moves into.
    async with aiofiles.open(f'{directory}/{name}_moves.txt', mode='w') as f:
        await f.write('\n'.join(moves))

    return {'name': name, 'moves': moves}

async def main():
    pathlist = Path(directory).glob('*.json')

    # A list to be populated with async tasks.
    tasks = []

    # Iterate through all json files in the directory.
    for path in pathlist:
        tasks.append(asyncio.ensure_future(write_pokemon_moves(path.name)))

    # Will contain a list of dictionaries containing Pokemons' names and moves
    moves_list = await asyncio.gather(*tasks)
This is a common way to utilize asynchronous code in Python, and is often used for things like making HTTP requests.
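Two variations on this pattern may be worth knowing. On Python 3.7 and later, `asyncio.create_task` is the documented preferred way to schedule a coroutine and can be used where `asyncio.ensure_future` appears. And when there are many files, you might want to cap how many are handled at once with an `asyncio.Semaphore`. The sketch below shows both; `process` is a hypothetical stand-in for a function like the one that handles each file, with `asyncio.sleep` in place of real file I/O, and the limit of 3 is arbitrary.

```python
import asyncio


async def process(item, semaphore):
    # Only a limited number of coroutines may hold the semaphore at once.
    async with semaphore:
        await asyncio.sleep(0.01)  # Stand-in for reading and writing a file.
        return item * 2


async def demo():
    # Create the semaphore inside the running event loop.
    semaphore = asyncio.Semaphore(3)
    tasks = [asyncio.create_task(process(i, semaphore)) for i in range(10)]
    # gather returns results in the same order as the tasks passed in.
    return await asyncio.gather(*tasks)


results = asyncio.run(demo())
print(results)
# [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```

Because `asyncio.gather` preserves input order, each result lines up with the item that produced it even though the tasks may finish at different times.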
So what do I use this for?
The examples in this post using data from the Pokemon API were just an excuse to show the functionality of the aiofiles module, and how you would write code to navigate through a directory of files for reading and writing. Hopefully, you can adapt these code samples to the specific problems you're trying to solve so file I/O doesn't become a blocker in your asynchronous code.
We have only scratched the surface of what you can do with aiofiles and asyncio, but I hope that this has made starting your journey into the world of asynchronous Python a little easier.
I’m looking forward to seeing what you build. Feel free to reach out and share your experiences or ask any questions.