This article gathers some of the practices I would like to see in the codebases I maintain and that I believe are not difficult to adopt. Obviously, it is impossible to demand every good coding practice from other developers, or from yourself, all at once. You can start with this initial set and gradually improve.
If you are looking for ways to optimize your code in terms of asymptotic complexity or anything related, this article will be of little use to you. However, if you want to improve the readability and organization of your Python projects, I strongly recommend reading it. You can adopt these practices in any type of project, not just in a specific niche like web development (which has some other good practices), data science, or any other area. It is important to consider that these practices should not override the code standards that your company/project already has, whether it's naming conventions, indentation, comments, or anything else. The overall goal is to provide you with ideas on how to keep your code readable and organized; you can modify the practices to your liking. Ultimately, what matters are the standards you adopt.
Exception Handling
This best practice might sound like a rule being imposed, but it's really important that you take it into consideration.
Why did I include exception handling as a topic to be addressed in good Python coding practices when there are many others? Well, exception handling is particularly dangerous in many applications, and if not done correctly, you will likely introduce unwanted behavior into your application. In short, it's much easier to get exception handling wrong than most other practices.
It's quite common to find the following code snippet in various Python projects:
try:
    something
except Exception as e:
    log.register('critical', e)
I was quite generous in using log.register here; you're more likely to find a print(e).
This is extremely problematic for two reasons:
Firstly, it catches all exceptions that may occur in that block of code, and you probably don't want that because debugging errors becomes much more difficult. Moreover, it lets the code inside the block fail in any way and still continue to execute. It also doesn't tell the reader anything about what might actually go wrong. If you put this in your application, you did it thinking about a potential error, such as a UniqueViolation from psycopg2 or a TypeError, and therefore you should handle only the errors you anticipated.
Secondly, it does nothing after catching the exception. In this case, it still logs an error (which is far from ideal), but in many applications with this type of snippet, it serves only to "not break" the code. However, it's always good for the rest of your application to know the result of that operation. Propagating the exception to higher layers or changing the behavior after a failure is a good practice.
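To make the first problem concrete, here is a minimal sketch of a broad handler hiding a bug. The function read_age_broad, the record layout, and the misspelled key are all hypothetical, and the standard logging module stands in for log.register.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def read_age_broad(record: dict) -> Optional[int]:
    """Anti-pattern: the broad handler hides a bug in our own code."""
    try:
        # Typo in the key name: this raises KeyError, a programming bug.
        return int(record["agee"])
    except Exception as error:
        logger.critical(error)
        return None


# The input is valid, yet the caller only sees None and the typo goes unnoticed.
hidden_bug_result = read_age_broad({"age": "42"})
```

Because the handler treats a typo and bad input the same way, the caller cannot tell a malformed record from a defect in the function itself.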
To refactor this exception handling, we could do something like this:
try:
    something
except psycopg2.errors.UniqueViolation as duplicate_error:
    log.register('critical', duplicate_error)
    # "from" keeps the original exception attached as the cause
    raise ModifiedExceptionDatabase('Value exists in the database.') from duplicate_error
except TypeError as type_error:
    log.register('critical', type_error)
    # do something
    raise ModifiedExceptionType('Type mismatch.') from type_error
finally:
    # runs whether or not an exception was raised
    connection.close()
It's also important to remember that it's not good to have large code blocks or a very extensive flow inside try-except blocks. The larger the block, the more errors it catches, and the less visibility you have of your application.
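One way to keep the block small is to put only the call that can fail inside try and move the rest into an else clause. The sketch below assumes a hypothetical helper, load_port, that reads a port number from a JSON string.

```python
import json


def load_port(raw_config: str) -> int:
    """Parse a JSON config string and return its port (hypothetical helper)."""
    try:
        # Only the call that can raise JSONDecodeError sits inside the try.
        config = json.loads(raw_config)
    except json.JSONDecodeError as decode_error:
        raise ValueError("Config is not valid JSON.") from decode_error
    else:
        # Runs only when parsing succeeded. A KeyError raised here is not
        # mistaken for a parsing problem and propagates untouched.
        return int(config["port"])
```

With this layout, a missing "port" key surfaces as a KeyError instead of being reported as invalid JSON.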
In summary, you will want to avoid catching the generic Exception and instead handle only specific exceptions. After handling them, share the result of the operation with the rest of your application so that something can be done in higher layers.
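The summary above can be sketched end to end with the standard library. This example swaps psycopg2 for sqlite3 so that it runs without a database server; DuplicateUserError, insert_user, and register are hypothetical names chosen for illustration.

```python
import sqlite3


class DuplicateUserError(Exception):
    """Hypothetical domain exception raised when a user already exists."""


def insert_user(connection: sqlite3.Connection, email: str) -> None:
    try:
        connection.execute("INSERT INTO users (email) VALUES (?)", (email,))
    except sqlite3.IntegrityError as duplicate_error:
        # Handle only the failure we anticipated and keep the original cause.
        raise DuplicateUserError(f"{email} already exists.") from duplicate_error


def register(connection: sqlite3.Connection, email: str) -> str:
    """Higher layer: decides what a duplicate means for the application."""
    try:
        insert_user(connection, email)
    except DuplicateUserError:
        return "already registered"
    return "registered"


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE users (email TEXT PRIMARY KEY)")
first_attempt = register(connection, "joe@example.com")
second_attempt = register(connection, "joe@example.com")
connection.close()
```

The lower layer translates the driver error into a domain error, and the higher layer chooses the behavior, so nothing fails silently.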
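To learn more about error handling: PEP 463 – Exception-catching expressions | peps.python.org (note that this PEP was rejected; the "Errors and Exceptions" chapter of the official Python tutorial covers the current syntax)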
Code Styling and Standards
For code standards, there are several points you can address, such as naming conventions, abbreviations, and common methods, among others. Speaking a bit about naming styles, Python's recommended styles are widely known; it's almost common knowledge that snake case is used for function, method, and variable names, and Pascal case for classes. Although they are widely adopted by the community, they are not mandatory. If your team is already accustomed to writing differently, that is completely acceptable, and it shouldn't be a problem for your codebase. However, if your team doesn't follow any standardization, whether in styling, naming, or tool usage, it's important that you adopt one. I strongly recommend that you write a document that is easily accessible to your team, containing the code standards you choose to adopt. To assist you, I'll list some points you can mention and give examples:
- Default Python version: 3.10
- Abbreviations. Ex: dt = date, nb = number
- Functions: Snake Case
- Methods: Snake Case
- Variables: Snake Case
- Classes: Pascal Case
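- Modules: Snake Case (short, all lowercase; a module with a hyphen in its name cannot be imported with a regular import statement)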
- Spacing and indentation: Refer to PEP 8 (or any other)
- Database connection: ...
- Database operations: ...
- Library for migrations: ...
- Credential management: ...
There are many other points you can address with a code standard to facilitate maintenance, reduce errors, and increase the testability of your code. Obviously, it's impossible to standardize all your code, and I dare say that would be harmful. However, mapping the most repeated points and addressing them in a standardized way is a good software development practice.
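As a small illustration of the naming entries in the list, the sketch below applies snake case, Pascal case, and the dt and nb abbreviations. The class and function names are hypothetical.

```python
class InvoiceReport:  # classes: Pascal case
    def total_amount(self, unit_price: float, nb_items: int) -> float:
        # methods and parameters: snake case; "nb" abbreviates "number"
        return unit_price * nb_items


def format_invoice_dt(invoice_dt: str) -> str:
    # functions and variables: snake case; "dt" abbreviates "date"
    formatted_dt = invoice_dt.replace("-", "/")
    return formatted_dt
```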
Type Hints
Python is a language with dynamic typing, which means that types belong to values and are checked at runtime rather than declared in the source. That is, you don't have to declare data types for variables, parameters, or function and method returns. Fortunately, it has type hints (available from Python 3.5), which are literally what the name suggests: hints about types. Type hints do not interfere with code execution but aid in readability and maintenance. Some tools, like PyCharm and VSCode extensions, highlight places where you assign a value that does not match the type hint. To annotate a variable or parameter as an int, you put a colon : after its name, followed by the type, as in age: int. For function returns, you put a hyphen followed by a greater-than sign -> after the parameter list, followed by the return type, such as str. Let's illustrate with code:
persons: dict[str, int] = {"joe": 78}
# persons.get("joe") could return None, which would not match the int hint
age: int = persons["joe"]

def is_legal_age(age: int) -> bool:
    return age >= 18
In our code, we have a dictionary with a string key and an integer value, and the type hint dict[str, int] exactly conveys this. You might think that in this case, a type hint could be more of a hindrance than a help since the variable's value is explicit in the code, and you are correct. But in cases where we can't see what value the variable assumes after code execution, type hints are extremely important to give context to your application.
In ideal applications, all variables would have type hints regardless of how the value is assigned to them. However, it's understandable that updating them whenever the code changes, or importing the types just to annotate them, could be complicated. Therefore, I usually put type hints only on the parameters of class constructors, functions, and methods, and on their returns.
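A minimal sketch of that compromise, with hints on constructor parameters, method parameters, and returns only. Person and find_person are hypothetical names used for illustration.

```python
from typing import Optional


class Person:
    def __init__(self, name: str, age: int) -> None:
        # Attributes are left unannotated; the parameters carry the hints.
        self.name = name
        self.age = age

    def is_legal_age(self, minimum_age: int = 18) -> bool:
        return self.age >= minimum_age


def find_person(people: list[Person], name: str) -> Optional[Person]:
    """Return the first person with the given name, or None if absent."""
    for person in people:
        if person.name == name:
            return person
    return None


joe = find_person([Person("joe", 78)], "joe")
```

The Optional[Person] return hint tells the caller, without reading the body, that the lookup may come back empty.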
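It's important to remember that there are tools like mypy that act as static type checkers: they read the type hints and report type errors without running the code. mypy does not block execution by itself, but you can run it in a pre-commit hook or a CI pipeline so that code with type errors is not merged until the error is fixed. Not every project uses one, but if you encounter many type problems in your codebase, you may consider it as an option.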
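The sketch below shows why a separate checker is needed: Python itself stores the hints but does not enforce them. It reuses the is_legal_age function from the earlier example.

```python
def is_legal_age(age: int) -> bool:
    return age >= 18


# The hint says int, but Python runs this call with a float without complaint.
# A static checker such as mypy would report the argument type as incompatible.
result = is_legal_age(17.5)

# The hints are only stored as metadata on the function object.
stored_hints = is_legal_age.__annotations__
```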
To learn more about type hints: PEP 484 – Type Hints | peps.python.org
Docstrings
Not only in Python but also in other languages, it's very common to have documentation for functions, classes, and methods in the code itself, as if they were comments. In Python, we call this documentation "docstrings," which are present everywhere from built-in functions (those that ship with the interpreter) to community-made frameworks. It consists of adding a string right below the definition of your function, class, or method that succinctly explains some fundamental points, such as what that function does, its parameters and their types, what it returns, and which exceptions it raises, among other points. Let's visualize with a code example:
def sum(one: int, two: int) -> int:
    """A function that sums two values

    Parameters:
        one: First value to sum
        two: Second value to sum

    Raises:
        TypeError: if one of the two values is not an integer

    Returns:
        The sum of the two values
    """
    # one + two alone would accept strings or floats, so check explicitly
    if not isinstance(one, int) or not isinstance(two, int):
        raise TypeError("Both values must be integers")
    return one + two
The function in the example sums two integers. The docstring explains what the function does, the parameters it receives and what they mean, which exception is raised and in which scenario, and what it returns. Of course, you can write the docstring to your liking; some people might disagree with how this one is written, but in the end, it fulfills its purpose of explaining the function's operation and also meets standardization criteria.
If you want to follow this pattern, enclose the docstring in triple double quotes: three to open it and three on their own line to close it. Explain briefly what the function does in the first line/sentence, list the parameters separated by line breaks, include the points you consider important, such as the exceptions raised, and finally mention the return value, separating all topics by blank lines. If the function doesn't have parameters, return values, or exceptions, it's okay to omit this information. It's also crucial to keep the docstring updated; if you make a modification and don't update the docstring, it's preferable to delete it than to leave misleading information.
Docstrings are stored in the __doc__ attribute of the class, function, or method, so if you type print(Class.__doc__), the docstring will be printed in the terminal. They also feed the built-in help: if you type help(sum), it will display the docstring as well. Additionally, docstrings support automatic documentation tools such as Sphinx, which relies on them for documentation generation.
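A minimal sketch of reading docstrings at runtime. Python stores a docstring in the __doc__ attribute of the object it documents, and inspect.getdoc returns it with the indentation cleaned up. The Person class and the is_legal_age function are illustrative.

```python
import inspect


class Person:
    """A person with a name and an age."""


def is_legal_age(age: int) -> bool:
    """Return True when the age is 18 or more."""
    return age >= 18


class_doc = Person.__doc__
function_doc = inspect.getdoc(is_legal_age)
# help(is_legal_age) would display the same text in an interactive session.
```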
Although it's a common consensus in the community that docstrings should be written for all classes, methods, and functions, it's understandable that this may not be possible, either because a lack of practice makes it hard to write readable documentation or because there is no time to create and maintain it. Therefore, something I have adopted is to write docstrings for functions that I consider too complex and that would take too long to understand just by reading the code. But of course, this is something you should think about and consider before imposing it on your projects or team.
Our example has an extremely simple function; in a real project, it would be ideal to give it a single-line docstring because it's a very obvious case. I used a more complex docstring because I believe it covers most cases in real projects. Anyway, docstrings are quite a complex subject; they could easily fill an entire article on their own.
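For comparison, a single-line docstring for the same kind of function could look like this. The function is named add here, an illustrative choice that avoids shadowing the built-in sum.

```python
def add(one: int, two: int) -> int:
    """Return the sum of two integers."""
    return one + two
```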
To learn more about docstrings: PEP 257 – Docstring Conventions | peps.python.org
Conclusion
Perhaps, like me, a good portion of the scripts you write are never read by others, and you may judge it unimportant to adopt these practices. However, in the future, when you write code for others, they will likely be very grateful to see these practices in your code. Furthermore, the you of the future will thank the you of the past for writing readable code and adopting code standards. It's important to exercise these skills so that they become almost like muscle memory and you can write readable code more quickly. If you can apply this, you'll become a much better developer.
Note: I didn't mention code linters because I consider them a step further. They are great and almost mandatory in serious Python projects, but they require external configuration, and this can be a problem for some people.