Jason C. McDonald

Posted on Jan 15, 2019 • Edited on Apr 27, 2022

Dead Simple Python: Project Structure and Imports

#python #beginners #coding

Like the articles? Buy the book! Dead Simple Python by Jason C. McDonald is available from No Starch Press.

The worst part of tutorials is always their simplicity, isn't it? Rarely will you find one with more than one file, far more seldom with multiple directories.

I've found that structuring a Python project is one of the most often overlooked components of teaching the language. Worse, many developers get it wrong, stumbling through a jumble of common mistakes until they arrive at something that at least works.

Here's the good news: you don't have to be one of them!

In this installment of the Dead Simple Python series, we'll be exploring import statements, modules, packages, and how to fit everything together without tearing your hair out. We'll even touch on VCS, PEP, and the Zen of Python. Buckle up!

Setting Up The Repository

Before we delve into the actual project structure, let's address how this fits into our Version Control System [VCS]...starting with the fact you need a VCS! A few reasons are...

Tracking every change you make,
Figuring out exactly when you broke something,
Being able to see old versions of your code,
Backing up your code, and
Collaborating with others.

You've got plenty of options available to you. Git is the most obvious, especially if you don't know what else to use. You can host your Git repository for free on GitHub, GitLab, Bitbucket, or Gitote, among others. If you want something other than Git, there are dozens of other options, including Mercurial, Bazaar, and Subversion (although if you use that last one, you'll probably be considered something of a dinosaur by your peers).

I'll be quietly assuming you're using Git for the rest of this guide, as that's what I use exclusively.

Once you've created your repository and cloned a local copy to your computer, you can begin setting up your project. At minimum, you'll need to create the following:

README.md: A description of your project and its goals.
LICENSE.md: Your project's license, if it's open source. (See opensource.org for more information about selecting one.)
.gitignore: A special file that tells Git what files and directories to ignore. (If you're using another VCS, this file has a different name. Look it up.)
A directory with the name of your project.

That's right...our Python code files actually belong in a separate subdirectory! This is very important, as our repository's root directory is going to get mighty cluttered with build files, packaging scripts, virtual environments, and all manner of other things that aren't actually part of the source code.

Just for the sake of example, we'll call our fictional project awesomething.

PEP 8 and Naming

Python style is governed largely by a set of documents called Python Enhancement Proposals, abbreviated PEP. Not all PEPs are actually adopted, of course - that's why they're called "Proposals" - but some are. You can browse the master PEP index on the official Python website. This index is formally referred to as PEP 0.

Right now, we're mainly concerned with PEP 8, first authored by the Python language creator Guido van Rossum back in 2001. It is the document which officially outlines the coding style all Python developers should generally follow. Keep it under your pillow! Learn it, follow it, encourage others to do the same.

(Side Note: PEP 8 makes the point that there are always exceptions to style rules. It's a guide, not a mandate.)

Right now, we're chiefly concerned with the section entitled "Package and Module Names"...

Modules should have short, all-lowercase names. Underscores can be used in the module name if it improves readability. Python packages should also have short, all-lowercase names, although the use of underscores is discouraged.

We'll get to what exactly modules and packages are in a moment, but for now, understand that modules are named by filenames, and packages are named by their directory name.

In other words, filenames should be all lowercase, with underscores if that improves readability. Similarly, directory names should be all lowercase, without underscores if at all avoidable. To put that another way...

Do This: awesomething/data/load_settings.py
NOT This: awesomething/Data/LoadSettings.py

I know, I know, long-winded way to make a point, but at least I put a little PEP in your step. (Hello? Is this thing on?)
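If you'd like to check names automatically, here's a rough sketch. The two regular expressions are my own reading of the PEP 8 wording quoted above (they are assumptions, not an official rule set), and check_module_path is a hypothetical helper name.

```python
import re

# Assumed patterns based on the PEP 8 quote: lowercase names, underscores
# allowed for modules but avoided for packages. Leading underscores are
# tolerated so that special files such as __init__.py pass.
MODULE_NAME = re.compile(r"^_*[a-z][a-z0-9_]*$")
PACKAGE_NAME = re.compile(r"^[a-z][a-z0-9]*$")


def check_module_path(path):
    """Return a list of naming complaints for a path like 'pkg/sub/module.py'."""
    *packages, filename = path.split("/")
    problems = []
    for package in packages:
        if not PACKAGE_NAME.match(package):
            problems.append(f"package '{package}' should be all-lowercase, no underscores")
    stem = filename[:-3] if filename.endswith(".py") else filename
    if not MODULE_NAME.match(stem):
        problems.append(f"module '{stem}' should be lowercase_with_underscores")
    return problems


good = check_module_path("awesomething/data/load_settings.py")
bad = check_module_path("awesomething/Data/LoadSettings.py")
print(good)
print(bad)
```

Remember the side note, though: PEP 8 is a guide, so treat a complaint from a checker like this as a nudge rather than an error.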

Packages and Modules

This is going to feel anticlimactic, but here are those promised definitions:

Any Python (.py) file is a module, and a bunch of modules in a directory is a package.

Well...almost. There's one other thing you have to do to a directory to make it a package, and that's to stick a file called __init__.py into it. You actually don't have to put anything into that file. It just has to be there.

There is other cool stuff you can do with __init__.py, but it's beyond the scope of this guide, so go read the docs to learn more.

If you do forget __init__.py in your package, it's going to do something much weirder than just failing, because that makes it an implicit namespace package. There are some nifty things you can do with that special type of package, but I'm not going into that here. As usual, you can learn more by reading the documentation: PEP 420: Implicit Namespace Packages.

So, if we look at our project structure, awesomething is actually a package, and it can contain other packages. Thus, we might call awesomething our top-level package, and all the packages underneath its subpackages. This is going to be really important once we get to importing stuff.
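To see the difference __init__.py makes, here's a small sketch that builds two throwaway directories and imports both. The names regular_demo and namespace_demo are made up for this example, and the check relies on the fact that a namespace package has no __init__.py file to point its __file__ at.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Code run in a fresh interpreter whose working directory holds both packages.
PROBE = (
    "import regular_demo, namespace_demo\n"
    "from namespace_demo import helper\n"
    "print(str(regular_demo.__file__).endswith('__init__.py'))\n"
    "print(getattr(namespace_demo, '__file__', None) is None)\n"
    "print(helper.VALUE)\n"
)

with tempfile.TemporaryDirectory() as root:
    # A regular package: the directory contains __init__.py.
    Path(root, "regular_demo").mkdir()
    Path(root, "regular_demo", "__init__.py").write_text("")

    # Same idea, but __init__.py was "forgotten".
    Path(root, "namespace_demo").mkdir()
    Path(root, "namespace_demo", "helper.py").write_text("VALUE = 42\n")

    result = subprocess.run(
        [sys.executable, "-c", PROBE],
        cwd=root, capture_output=True, text=True, check=True,
    )

lines = result.stdout.split()
print(lines)
```

Both imports succeed, which is exactly why a missing __init__.py is easy to overlook: nothing fails loudly, the directory just quietly becomes a different kind of package.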

Let's look at a snapshot of one of my real-world projects, omission, to get an idea of how we're structuring stuff...

omission-git
├── LICENSE.md
├── omission
│   ├── app.py
│   ├── common
│   │   ├── classproperty.py
│   │   ├── constants.py
│   │   ├── game_enums.py
│   │   └── __init__.py
│   ├── data
│   │   ├── data_loader.py
│   │   ├── game_round_settings.py
│   │   ├── __init__.py
│   │   ├── scoreboard.py
│   │   └── settings.py
│   ├── game
│   │   ├── content_loader.py
│   │   ├── game_item.py
│   │   ├── game_round.py
│   │   ├── __init__.py
│   │   └── timer.py
│   ├── __init__.py
│   ├── __main__.py
│   ├── resources
│   └── tests
│       ├── __init__.py
│       ├── test_game_item.py
│       ├── test_game_round_settings.py
│       ├── test_scoreboard.py
│       ├── test_settings.py
│       ├── test_test.py
│       └── test_timer.py
├── pylintrc
├── README.md
└── .gitignore

(In case you're wondering, I used the UNIX program tree to make that little diagram above.)

You'll see that I have one top-level package called omission, with four sub-packages: common, data, game, and tests. I also have the directory resources, but that only contains game audio, images, etc. (omitted here for brevity). resources is NOT a package, as it doesn't contain an __init__.py.

I also have another special file in my top-level package: __main__.py. This is the file that is run when we execute our top-level package directly via python -m omission. We'll talk about what goes in that __main__.py in a bit.

How import Works

If you've written any meaningful Python code before, you're almost certainly familiar with the import statement. For example...

import re

It is helpful to know that, when we import a module, we are actually running it. This means that any import statements in the module are also being run.

For example, re.py has several import statements of its own, which are executed when we say import re. That doesn't mean they're available to the file we imported re from, but it does mean those files have to exist. If (for some unlikely reason) enum.py got deleted from your environment, and you ran import re, it would fail with an error...

Traceback (most recent call last):
  File "weird.py", line 1, in <module>
    import re
  File "re.py", line 122, in <module>
    import enum
ModuleNotFoundError: No module named 'enum'

Naturally, reading that, you might get a bit confused. I've had people ask me why the outer module (in this example, re) can't be found. Others have wondered why the inner module (enum here) is being imported at all, since they didn't ask for it directly in their code. The answer is simple: we imported re, and that imports enum.

Of course, the above scenario is fictional: import enum and import re are never going to fail under normal circumstances, because both modules are part of Python's core library. It's just a silly example. ;)
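You can reproduce the same kind of failure safely with a module of your own. In this sketch, outer.py and inner_that_does_not_exist are hypothetical names; the script is run in a separate interpreter from a temporary directory so nothing on your system is touched.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as workdir:
    # outer.py prints something at the top level, then imports a missing module.
    Path(workdir, "outer.py").write_text(
        "print('outer is running')\n"
        "import inner_that_does_not_exist\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", "import outer"],
        cwd=workdir, capture_output=True, text=True,
    )

# Importing outer really did execute its code...
ran_outer = "outer is running" in result.stdout
# ...and the error names the inner module, not the one we asked for.
error_line = result.stderr.strip().splitlines()[-1]
print(ran_outer)
print(error_line)
```

The print() call proves that import outer ran the module, and the final line of the traceback complains about the inner module even though our own code never mentioned it.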

Import Dos and Don'ts

There are actually a number of ways of importing, but most of them should rarely, if ever, be used.

For all of the examples below, we'll imagine that we have a file called smart_door.py:

# smart_door.py
def close():
    print("Ahhhhhhhhhhhh.")

def open():
    print("Thank you for making a simple door very happy.")

Just for example, we will run the rest of the code in this section in the Python interactive shell, from the same directory as smart_door.py.

If we want to run the function open(), we have to first import the module smart_door. The easiest way to do this is...

import smart_door
smart_door.open()
smart_door.close()

We would actually say that smart_door is the namespace of open() and close(). Python developers really like namespaces, because they make it obvious where functions and whatnot are coming from.

(By the way, don't confuse namespace with implicit namespace package. They're two different things.)

The Zen of Python, also known as PEP 20, defines the philosophy behind the Python language. The last line has a statement that addresses this:

Namespaces are one honking great idea -- let's do more of those!

At a certain point, however, namespaces can become a pain, especially with nested packages. foo.bar.baz.whatever.doThing() is just ugly. Thankfully, we do have a way around having to use the namespace every time we call the function.

If we want to be able to use the open() function without constantly having to precede it with its module name, we can do this instead...

from smart_door import open
open()

Note, however, that neither close() nor smart_door.close() will work in that last scenario, because we imported only open(); neither close nor the module name smart_door was brought into our code. To use close(), we'd have to change the code to this...

from smart_door import open, close
open()
close()

In that terrible nested-package nightmare earlier, we can now say from foo.bar.baz.whatever import doThing, and then just use doThing() directly. Alternatively, if we want a LITTLE bit of namespace, we can say from foo.bar.baz import whatever, and say whatever.doThing().

The import system is deliciously flexible like that.
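If you want to confirm which names a from ... import statement actually creates, here's a small sketch. It writes the smart_door.py from above into a temporary directory and runs a short script against it in a separate interpreter; the use of globals() to inspect the names is just my choice for the demonstration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

SMART_DOOR = '''
def close():
    print("Ahhhhhhhhhhhh.")

def open():
    print("Thank you for making a simple door very happy.")
'''

SCRIPT = '''
from smart_door import open
open()
for name in ("close", "smart_door"):
    print(name, "is defined:", name in globals())
'''

with tempfile.TemporaryDirectory() as workdir:
    Path(workdir, "smart_door.py").write_text(SMART_DOOR)
    result = subprocess.run(
        [sys.executable, "-c", SCRIPT],
        cwd=workdir, capture_output=True, text=True, check=True,
    )

output_lines = result.stdout.splitlines()
print(output_lines)
```

Only open exists afterward. The module itself was still loaded and run, but neither its name nor close() was handed to our code.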

Before long, though, you'll probably find yourself saying "But I have hundreds of functions in my module, and I want to use them all!" This is the point at which many developers go off the rails, by doing this...

from smart_door import *

This is very, very bad! Simply put, it imports everything in the module directly, and that's a problem. Imagine the following code...

from smart_door import *
from gzip import *
open()

What do you suppose will happen? The answer is, gzip.open() will be the function that gets called, since that's the last version of open() that was imported, and thus defined, in our code. smart_door.open() has been shadowed - we can't call it as open(), which means we effectively can't call it at all.

Of course, since we usually don't know, or at least don't remember, every single function, class, and variable in every module that gets imported, we can easily wind up with a whole lot of messes.
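Here's a sketch that makes the shadowing visible. It recreates smart_door.py in a temporary directory, performs both star imports in a separate interpreter, and then checks which function the bare name open refers to. The extra import gzip, smart_door line exists only so the script can compare identities.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

SMART_DOOR = '''
def close():
    print("Ahhhhhhhhhhhh.")

def open():
    print("Thank you for making a simple door very happy.")
'''

SCRIPT = '''
from smart_door import *
from gzip import *
import gzip, smart_door  # only here so we can compare identities

print(open is gzip.open)
print(open is smart_door.open)
smart_door.open()  # the explicit namespace still reaches the right function
'''

with tempfile.TemporaryDirectory() as workdir:
    Path(workdir, "smart_door.py").write_text(SMART_DOOR)
    result = subprocess.run(
        [sys.executable, "-c", SCRIPT],
        cwd=workdir, capture_output=True, text=True, check=True,
    )

output_lines = result.stdout.splitlines()
print(output_lines)
```

The last star import silently wins, with no warning of any kind. Going through the namespace, as in smart_door.open(), is unaffected.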

The Zen of Python addresses this scenario as well...

Explicit is better than implicit.

You should never have to guess where a function or variable is coming from. Somewhere in the file should be code that explicitly tells us where it comes from. The first two scenarios demonstrate that.

I should also mention that the earlier foo.bar.baz.whatever.doThing() scenario is something Python developers do NOT like to see. Also from the Zen of Python...

Flat is better than nested.

Some nesting of packages is okay, but when your project starts looking like an elaborate set of Matryoshka dolls, you've done something wrong. Organize your modules into packages, but keep it reasonably simple.

Importing Within Your Project

That project file structure we created earlier is about to come in very handy. Recall my omission project...

omission-git
├── LICENSE.md
├── omission
│   ├── app.py
│   ├── common
│   │   ├── classproperty.py
│   │   ├── constants.py
│   │   ├── game_enums.py
│   │   └── __init__.py
│   ├── data
│   │   ├── data_loader.py
│   │   ├── game_round_settings.py
│   │   ├── __init__.py
│   │   ├── scoreboard.py
│   │   └── settings.py
│   ├── game
│   │   ├── content_loader.py
│   │   ├── game_item.py
│   │   ├── game_round.py
│   │   ├── __init__.py
│   │   └── timer.py
│   ├── __init__.py
│   ├── __main__.py
│   ├── resources
│   └── tests
│       ├── __init__.py
│       ├── test_game_item.py
│       ├── test_game_round_settings.py
│       ├── test_scoreboard.py
│       ├── test_settings.py
│       ├── test_test.py
│       └── test_timer.py
├── pylintrc
├── README.md
└── .gitignore

In my game_round_settings module, defined by omission/data/game_round_settings.py, I want to use my GameMode class. That class is defined in omission/common/game_enums.py. How do I get to it?

Because I defined omission as a package, and organized my modules into subpackages, it's actually pretty easy. In game_round_settings.py, I say...

from omission.common.game_enums import GameMode

This is called an absolute import. It starts at the top-level package, omission, and walks down into the common package, where it looks for game_enums.py.

Some developers come to me with import statements more like from common.game_enums import GameMode, and wonder why it doesn't work. Simply put, the data package (where game_round_settings.py lives) has no knowledge of its sibling packages.

It does, however, know about its parents. Because of this, Python has something called relative imports that lets us do the same thing like this instead...

from ..common.game_enums import GameMode

The .. means "this package's direct parent package", which in this case, is omission. So, the import steps back one level, walks down into common, and finds game_enums.py.

There's a lot of debate about whether to use absolute or relative imports. Personally, I prefer to use absolute imports whenever possible, because it makes the code a lot more readable. You can make up your own mind, however. The only important part is that the result is obvious - there should be no mystery where anything comes from.
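To try all three import styles side by side, here's a sketch that builds a miniature stand-in for the layout above and runs it with python -m. The package name omissiondemo, the TIMED member, and broken.py are all invented for this example; they are not taken from the real omission project.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

FILES = {
    "omissiondemo/__init__.py": "",
    "omissiondemo/common/__init__.py": "",
    "omissiondemo/common/game_enums.py": (
        "import enum\n"
        "class GameMode(enum.Enum):\n"
        "    TIMED = 1  # hypothetical member\n"
    ),
    "omissiondemo/data/__init__.py": "",
    # Absolute and relative imports of the same class.
    "omissiondemo/data/game_round_settings.py": (
        "from omissiondemo.common.game_enums import GameMode as absolute_mode\n"
        "from ..common.game_enums import GameMode as relative_mode\n"
        "SAME_CLASS = absolute_mode is relative_mode\n"
    ),
    # The import that does NOT work: it skips the top-level package.
    "omissiondemo/data/broken.py": "from common.game_enums import GameMode\n",
    "omissiondemo/__main__.py": (
        "from omissiondemo.data import game_round_settings\n"
        "print(game_round_settings.SAME_CLASS)\n"
        "try:\n"
        "    from omissiondemo.data import broken\n"
        "except ModuleNotFoundError as error:\n"
        "    print(error)\n"
    ),
}

with tempfile.TemporaryDirectory() as root:
    for relative_path, source in FILES.items():
        target = Path(root, relative_path)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(source)
    # Run the top-level package from the "repository root".
    result = subprocess.run(
        [sys.executable, "-m", "omissiondemo"],
        cwd=root, capture_output=True, text=True, check=True,
    )

output_lines = result.stdout.splitlines()
print(output_lines)
```

The absolute and relative forms resolve to the very same class object, while the import that starts at common fails because nothing named common is visible from the top of the search path.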

(Continued Reading: Real Python - Absolute vs Relative Imports in Python)

There is one other lurking gotcha here! In omission/data/settings.py, I have this line:

from omission.data.game_round_settings import GameRoundSettings

Surely, since both these modules are in the same package, we should be able to just say from game_round_settings import GameRoundSettings, right?

Wrong! It will actually fail to locate game_round_settings.py. This is because we are running the top-level package omission, which means the search path (where Python looks for modules, and in what order) works differently.

However, we can use a relative import instead:

from .game_round_settings import GameRoundSettings

In that case, the single . means "this package".

If you're familiar with the typical UNIX file system, this should start to make sense. .. means "back one level", and . means "the current location". Of course, Python takes it one step further: ... means "back two levels", .... is "back three levels", and so forth.

However, keep in mind that those "levels" aren't just plain directories, here. They're packages. If you have two distinct packages in a plain directory that is NOT a package, you can't use relative imports to jump from one to another. You'll have to work with the Python search path for that, and that's beyond the scope of this guide. (See the docs at the end of this article.)
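If you're ever unsure what a dotted relative name points at, the standard library can translate it for you. This sketch uses importlib.util.resolve_name with the package names from my omission project; nothing is actually imported, the function only computes the absolute name.

```python
from importlib.util import resolve_name

# Pretend we are inside a module that lives in the omission.data package.
current_package = "omission.data"

# One dot: stay in the current package.
same_package = resolve_name(".game_round_settings", current_package)

# Two dots: step up to the parent package, then walk down into common.
parent_package = resolve_name("..common.game_enums", current_package)

# Three dots would step above the top-level package, which is not allowed.
try:
    resolve_name("...too.far", current_package)
    beyond_top_level = None
except (ImportError, ValueError) as error:  # the exception type varies by Python version
    beyond_top_level = type(error).__name__

print(same_package)
print(parent_package)
print(beyond_top_level)
```

The last case is the "levels are packages, not directories" rule in action: once you run out of packages, there is nowhere left to step back to.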

`__main__.py`

Remember when I mentioned creating a __main__.py in our top-level package? That is a special file that is executed when we run the package directly with Python. My omission package can be run from the root of my repository with python -m omission.

Here's the contents of that file:

from omission import app

if __name__ == '__main__':
    app.run()

Yep, that's actually it! I'm importing my module app from the top-level package omission.

Remember, I could also have said from . import app instead. Alternatively, if I wanted to just say run() instead of app.run(), I could have done from omission.app import run or from .app import run. In the end, it doesn't make much technical difference HOW I do that import, so long as the code is readable.

(Side Note: We could debate whether it's logical for me to have a separate app.py for my main run() function, but I have my reasons...and they're beyond the scope of this guide.)

The part that confuses most folks at first is the whole if __name__ == '__main__' statement. Python doesn't have much boilerplate - code that must be used pretty universally with little to no modification - but this is one of those rare bits.

__name__ is a special string attribute of every Python module. If I were to stick the line print(__name__) at the top of omission/data/settings.py, when that module got imported (and thus run), we'd see "omission.data.settings" printed out.

When a module is run directly via python -m some_module, that module is assigned a special value of __name__: "__main__".

Thus, if __name__ == '__main__': is actually checking if the module is being executed as the main module. If it is, it runs the code under the conditional.

You can see this in action another way. If I added the following to the bottom of app.py...

if __name__ == '__main__':
    run()

...I can then execute that module directly via python -m omission.app, and the results are the same as python -m omission. Now __main__.py is being ignored altogether, and the __name__ of omission/app.py is "__main__".

Meanwhile, if I just run python -m omission, that special code in app.py is ignored, since its __name__ is now omission.app again.

See how that works?
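If you'd like to watch __name__ change without setting up a whole package, here's a sketch using runpy from the standard library. The file whoami_demo.py is hypothetical, and passing run_name only simulates the name the import system would assign; it is a stand-in for a real import, not the same thing.

```python
import runpy
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as workdir:
    script = Path(workdir, "whoami_demo.py")
    script.write_text(
        "REPORTED_NAME = __name__\n"
        "RAN_AS_MAIN = False\n"
        "if __name__ == '__main__':\n"
        "    RAN_AS_MAIN = True\n"
    )

    # Simulate "this module was imported": __name__ is the module's own name.
    as_imported = runpy.run_path(str(script), run_name="whoami_demo")

    # Simulate "this module was executed directly": __name__ is "__main__".
    as_main = runpy.run_path(str(script), run_name="__main__")

print(as_imported["REPORTED_NAME"], as_imported["RAN_AS_MAIN"])
print(as_main["REPORTED_NAME"], as_main["RAN_AS_MAIN"])
```

Same file, same code, and the only thing that decides whether the guarded block runs is the value of __name__.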

Wrapping Up

Let's review.

Every project should use a VCS, such as Git. There are plenty of options to choose from.
Every Python code file (.py) is a module.
Organize your modules into packages. Each package must contain a special __init__.py file.
Your project should generally consist of one top-level package, usually containing sub-packages. That top-level package usually shares the name of your project, and exists as a directory in the root of your project's repository.
NEVER EVER EVER use * in an import statement. Before you entertain a possible exception, the Zen of Python points out "Special cases aren't special enough to break the rules."
Use absolute or relative imports to refer to other modules in your project.
Executable projects should have a __main__.py in the top-level package. Then, you can directly execute that package with python -m myproject.

Of course, there are a lot more advanced concepts and tricks we can employ in structuring a Python project, but we won't be discussing that here. I highly recommend reading the docs:

Thank you to grym, deniska (Freenode IRC #python), @cbrintnall, and @rhymes (Dev) for suggested revisions.

Top comments (56)

Sandor Dargo • Sep 28 '20

Great article, Jason

I cannot figure out a problem, maybe you can share your ideas.

I have the following structure:

.
├── cmake_project_creator
│   ├── dependency.py
│   ├── directory_factory.py
│   ├── directory.py
│   ├── include_directory.py
│   ├── __init__.py
│   ├── project_creator.py
│   ├── source_directory.py
│   └── test_directory.py
├── examples
│   ├── dual.json
│   ├── nested_dual.json
│   └── single.json
├── README.md
└── tests
    ├── __init__.py
    ├── test_dependency.py
    ├── test_directory_factory.py
    ├── test_directory.py
    ├── test_include_directory.py
    ├── test_project_creator.py
    ├── test_source_directory.py
    └── test_test_directory.py

The main entry point is cmake_project_creator/project_creator.py asking for a couple of parameters.

If I try to invoke it from Pycharm, everything is fine.
The tests, run with nosetests --with-coverage --cover-erase, pass fine. But if I try to invoke cmake_project_creator/project_creator.py from the terminal, this is what I get:

sdargo@mymachine (master) ~/personal/dev/project_creator $ python cmake_project_creator/project_creator.py -c
Traceback (most recent call last):
  File "cmake_project_creator/project_creator.py", line 6, in <module>
    from cmake_project_creator import directory_factory
ModuleNotFoundError: No module named 'cmake_project_creator'

Do you have any idea what can be the issue?

Jason C. McDonald • Sep 28 '20

Absolutely. Your package needs a dedicated entry point for any imports off cmake_project_creator to work.

Add __main__.py to cmake_project_creator/. Your __main__.py file should look something like this:

from . import project_creator

if __name__ == "__main__":
    project_creator.WHATEVER_YOUR_INITIAL_FUNCTION_IS

Then, you can invoke the package directly with...

python3 -m cmake_project_creator

Sandor Dargo • Sep 28 '20

Thanks a lot, Jason! This partly solved my problem.

Now I can run for example python3 -m cmake_project_creator -c where -c is a parameter and it works like a charm. But after adding the correct shebang and execution rights, I still cannot simply run ./cmake_project_creator/project_creator.py -c as I have the same failure of :

Traceback (most recent call last):
  File "./cmake_project_creator/project_creator.py", line 8, in <module>
    from cmake_project_creator import directory_factory
ModuleNotFoundError: No module named 'cmake_project_creator'

Do I really have to manipulate sys.path for that?

Jason C. McDonald • Sep 28 '20 • Edited

As a rule, never ever ever ever EVER manipulate sys.path to solve Python import issues. It has some pretty serious and nasty side-effects that can break other programs.

You shouldn't invoke modules within a package like this. Instead, I'd recommend adding command-line argument support to your __main__.py, via argparse.

With __main__.py becoming the dedicated entry point, you should update it further to have a dedicated main() function, like this:

def main():
    # Do whatever....

if __name__ == '__main__':
    main()

The sole entry point to your package should be python3 -m cmake_project_creator, or an entry point script that invokes cmake_project_creator.main().

Sandor Dargo • Sep 28 '20

Ok, thanks. Yes, I've been already using argparse to get the CL arguments.
So one option is to use the -m option and the other way I managed to make it work is to add the repo-root to the PYTHONPATH, which could be done by a setup.py and most probably it would be OK to have it in a virtualenv.

Thanks once more!

Jason C. McDonald • Sep 28 '20

Well, like I said, changing the path is always wrong. Yes, even in a virtualenv, especially since you can't guarantee that it'll always be run in one by another user. So, you only have one option, being the one I described. But, shrug, I've said my piece.

Sandor Dargo • Sep 29 '20 • Edited

I got your point and at the same time, in general, I don't believe in "having only one option". My problem with invoking a product with -m is twofold. One, it's not at all user-friendly, and the other is that it's leaking an abstraction. The product is implemented as a module with that name.

Following your recommendation not to change any path variable, I found two ways to overcome this.
1) I wrap the python3 -m cmake_project_creator into a shell script. As such, users don't have to bother with -m, or even with prepending python3 to the module or script name. On the other hand, it's not very portable (what about Windows users, for example?); this might or might not be acceptable. In my case, it would be.
2) I managed to invoke the module with runpy.run_module("cmake_project_creator", run_name='__main__') from another Python script which, given a correct shebang, I can simply call as ./run.py <args>. To me this seems ideal, as I keep the invocation (from a user perspective) as simple and as portable as possible, and I encapsulate both the module name and the fact that the product is implemented as a module.

PS: The product is going to be completely free; with the word product I only want to emphasize that it's meant to be used by people who might not even know what python -m is, or Python at all.

Jason C. McDonald • Sep 29 '20

That's why you have an entry point script, or even several, as I alluded to earlier. You can use your setup.py to provide those, and those scripts can even be named such as you describe. But editing the Python path is still always wrong, for technical reasons.

Python quite often is meant to have only one right way of doing something. The language is built that way.

As I haven't yet been able to write the article on setup.py, please read this article by Chris Warrick: chriswarrick.com/blog/2014/09/15/p...

Sandor Dargo • Sep 30 '20

Thanks for the recommendation. I'm definitely going to read it as that's pretty much my next thing to do, understand what I need to put in the setup.py. Thanks again!

Ashley Hoff • Sep 19 '19 • Edited

Hi!
Firstly great article. This has been one of the clearest examples of how it should be done. Thanks.

I am not sure whether this is an edge case, but I have a structure that looks like this:

generateandsend/
├── __init__.py
├── __main__.py
│
├── generatedata/
│   ├── __init__.py
│   └── generate_data.py
│ 
├── senddata/
│   ├── __init__.py
│   └── send_data.py
│ 
├── utilities/
│   ├── __init__.py
│   └── local_utilities.py
│ 
├── Readme.md
└── License.md

Both generate_data.py and send_data.py reference functions in local_utilities.py.

The issue I have is that, more often than not, I would be calling send_data.py or generate_data.py directly.

I know that if I call either of them specifically, I will need to add a reference to be able to import local_utilities.

Does this go against the general accepted practice? Would it be better to either separate them into different projects (I would like to keep all the code together) or use an argparser in __main__ and call the respective module using args?

Thanks
Ashley

Jason C. McDonald • Sep 24 '19 • Edited

Hi Ashley,

Sorry for the tremendous delay in reply. So, just to be clear, you're wanting to be able to call generate_data.py and send_data.py directly, and those are supposed to be able to import a module from elsewhere in the project?

If so, I would actually consider why you want to execute those modules directly. If you're simply wanting to be able to execute the two separately from the command-line, it may be worth fleshing out __main__.py to accept a command-line argument, so python3 -m generateandsend send or python3 -m generateandsend generate will execute what you want. That'll also be the easiest solution. That way, you're always executing the top-level package (generateandsend)
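As a rough sketch of what that __main__.py could look like, here is one way to dispatch on a subcommand with argparse. The generate() and send() functions are placeholders standing in for whatever generate_data.py and send_data.py actually expose, and the return strings exist only so the example has something to show.

```python
import argparse


def generate():
    # Placeholder for the real work in generatedata/generate_data.py.
    return "generating data"


def send():
    # Placeholder for the real work in senddata/send_data.py.
    return "sending data"


def main(argv=None):
    """Parse the subcommand and call the matching handler."""
    parser = argparse.ArgumentParser(prog="generateandsend")
    subcommands = parser.add_subparsers(dest="command", required=True)
    subcommands.add_parser("generate").set_defaults(handler=generate)
    subcommands.add_parser("send").set_defaults(handler=send)
    args = parser.parse_args(argv)  # argv=None means "use the real command line"
    return args.handler()


# Passing explicit lists here mimics `python3 -m generateandsend generate`
# and `python3 -m generateandsend send` without touching sys.argv.
generate_result = main(["generate"])
send_result = main(["send"])
print(generate_result)
print(send_result)
```

In a real __main__.py, the two demonstration calls at the bottom would be replaced by a single main() call under the usual if __name__ == '__main__': guard.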

In fact, I'm not entirely sure off the top of my head how to get multiple projects to talk to one another within a shared directory! I know it has to do with PYTHONPATH, but I think that will necessitate more research on my part. ;)

Ashley Hoff • Sep 24 '19

Thanks for replying (& no problem on the delay - we all have a life to live!).

I have thought about this more and agree - Why is it that I want to call them separately, where a parameter will suffice. So, I have abandoned the idea and gone with the python3 -m generateandsend generate approach

Cheers for the reply though. Appreciated.

Ashley Hoff • Sep 20 '19

I also have one more question - if I wanted to include an ini/configuration file in a resources folder, how would I import it?

Thanks

Jason C. McDonald • Sep 24 '19 • Edited

I like to put all such non-code files in a project subdirectory (not a package) called resources, and then use the pkg_resources package (which is provided by setuptools rather than built into Python) to access it.

For example, in my omission project, the module omission/game/content_loader.py needs to load the text file omission/resources/content/content.txt. I do that with...

import os

import pkg_resources

class ContentLoader(object):

    def __init__(self):
        """
        Open the file and load the contents in.
        """

        # ...

        path = pkg_resources.resource_filename(
            __name__,
            os.path.join(os.pardir, 'resources', 'content', 'content.txt')
        )

        with open(path, 'rt', encoding='utf-8') as content_file:
            raw_content = content_file.read()

        # ...

Simple as that!

P.S. If you find yourself needing to access files outside of your project directory, say, in the user's home directory, I recommend the package appdirs.

Ashley Hoff • Sep 24 '19

Again, thanks for the reply.

I've had a play with this one. Considering I am dealing with an ini file, it appears that configparser does what I want. This is the snippet I've come up with:

import configparser


def conf():
    config = configparser.ConfigParser(converters={'list': lambda x: [i.strip() for i in x.split(',')]},
                                       allow_no_value=True)
    config.read('generateandsend/Resources/generateandsend.ini')
    section = config['test']
    string_a = section.get('StringA', None)
    string_b = section.get('StringB', None)

    return string_a, string_b

Is hard coding the relative path in that way frowned upon?

Jason C. McDonald • Sep 24 '19 • Edited

Is hard coding the relative path in that way frowned upon?

Most certainly, especially because you have to account for differences in path format between operating systems.

I'd recommend incorporating pkg_resources into your approach above.

import configparser
import os

import pkg_resources


def conf():
    config = configparser.ConfigParser(converters={'list': lambda x: [i.strip() for i in x.split(',')]},
                                       allow_no_value=True)
    # Resolve the .ini file relative to this module rather than the current working directory.
    path = pkg_resources.resource_filename(
        __name__,
        os.path.join(os.pardir, 'Resources', 'generateandsend.ini')
    )
    config.read(path)
    section = config['test']
    string_a = section.get('StringA', None)
    string_b = section.get('StringB', None)

    return string_a, string_b

I believe that will work? You'll have to check how config.read() handles an absolute path.
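On the configparser side of this, here is a small sketch that can be run without any file on disk. It feeds an in-memory sample to the same kind of parser; the Recipients key and the sample values are invented for illustration, and split_list is simply the lambda from the snippet above given a name.

```python
import configparser


def split_list(value):
    """Turn 'a, b , c' into ['a', 'b', 'c']."""
    return [item.strip() for item in value.split(',')]


SAMPLE = """
[test]
StringA = hello
Recipients = alice, bob , carol
"""

# Registering a 'list' converter adds a getlist() method alongside get().
config = configparser.ConfigParser(converters={'list': split_list},
                                   allow_no_value=True)
config.read_string(SAMPLE)

section = config['test']
string_a = section.get('StringA')            # option names are case-insensitive
string_b = section.get('StringB')            # missing option falls back to None
recipients = section.getlist('Recipients')   # provided by the converter

# read() skips files it cannot find and returns the list it did read.
files_read = configparser.ConfigParser().read('no_such_file_for_demo.ini')

print(string_a, string_b, recipients, files_read)
```

That last line is worth keeping in mind when debugging paths: a wrong path does not raise an error at read() time, so checking the returned list is a cheap way to notice that the .ini file was never found.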

Ashley Hoff • Sep 24 '19

Beautiful. Thanks. I had to massage it a little and remove os.pardir, as it was giving me a false directory on my windows machine (C:\tmp\generateandsend\..\Resources\generateandsend.ini).

The resultant path variable now looks like:

path = pkg_resources.resource_filename(__name__, os.path.join('Resources', 'dummy.ini'))

I just need to test this on my Linux box

Cheers again. Send the bill to...... 😉

mkaut • Dec 11 '19

Great introduction.
I have one question: you have tests inside the project directory, while this guide places both docs and tests into the git root. Are there any up- or down-sides to either of the choices?

Jason C. McDonald • Dec 11 '19

My method just makes the imports a lot easier. You'll notice that the guide you linked to requires some complex imports for the tests to work, whereas my approach requires nothing of the sort, since tests are part of the module.

I suppose if you absolutely don't want to ship tests as part of your finished product, that might justify the other approach. That said, I prefer to always ship tests in the project; it makes debugging on another system a lot more feasible.

mkaut • Dec 16 '19

Good point, thanks.

So, in your approach, how do you import, let's say game_item.py from test_game_item.py?
And does it then have to be run from a specific folder (omission-git, omission-git/omission/, or omission-git/omission/tests) or does it work from all the above?

Jason C. McDonald • Dec 16 '19

Within omission/tests/test_game_item.py, I would import that other module via...

import omission.game.game_item

I always run python -m omission or pytest omission from within omission-git.

rhymes • Jan 15 '19

Hi Jason, nice article!

Just a question: I've noticed you didn't talk about namespace packages. Is it because it might be outside the scope of a "dead simple" intro?

I'm mentioning it because I believe they are a simpler concept for a new developer, as in: folders are packages; if you need initialization code for such a package, add an __init__.py, otherwise you can totally ignore the file. I'm oversimplifying here, of course.

Thank you!

Jason C. McDonald • Jan 15 '19

That was something I actually didn't know about. Thanks for the link! It is probably more advanced than I want to go in the article series, but thanks for parking it in a comment anyhow. I'll look at this again later, and see if it might be worth adding to the guide after all. Thank you!

rhymes • Jan 15 '19

An example:

➜ tree
.
└── smart_door
    └── open.py

1 directory, 1 file

➜  cat smart_door/open.py
print("I have opened")

➜  python
Python 3.7.2 (default, Jan 13 2019, 22:54:07)
[Clang 10.0.0 (clang-1000.11.45.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.

>>> from smart_door import open
I have opened

You can read more about it here.
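
The same thing can be checked from a script. A minimal sketch, assuming Python 3.3 or later; the package and module names below are made up so they don't clash with anything real:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # A plain folder with one module in it and no __init__.py.
    package_dir = Path(tmp) / "smart_door_demo"
    package_dir.mkdir()
    (package_dir / "hinge.py").write_text("STATE = 'opened'\n")

    # Make the folder's parent importable for the duration of the demo.
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        package = importlib.import_module("smart_door_demo")
        hinge = importlib.import_module("smart_door_demo.hinge")
        # A namespace package has no __init__.py, so there is no file
        # for __file__ to point at (it is None or missing entirely).
        is_namespace = getattr(package, "__file__", None) is None
        state = hinge.STATE
    finally:
        sys.path.remove(tmp)

print(is_namespace, state)
```

Dropping an __init__.py into the folder would turn it back into a regular package, and __file__ would then point at that file.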

Grzegorz Krug • Sep 2 '19 • Edited

My tree example:

---src
init.py
main.py
------game
---------cards98.py

------reinforced
---------rl_agent.py

------supervised

Readme.md
License.md

I cannot reach the parent module from rl_agent.py.

I added some __init__.py files, but that does not solve it.
I have tried:
from game.cards98 import GameCards98
from src.game.cards98 import GameCards98

And all I get is ModuleNotFoundError: No module named 'src'
This works fine in PyCharm, but not in IDLE :/

Jason C. McDonald • Sep 2 '19 • Edited

This structure should work:

src/
├── __init__.py
├── __main__.py
│
├── game/
│   ├── __init__.py
│   └── cards98.py
│ 
├── reinforced/
│   ├── __init__.py
│   └── rl_agent.py
│ 
├── supervised/
│   └── __init__.py
│ 
├── Readme.md
└── License.md

You need __init__.py under each directory that you want to use as a package.

Then, from rl_agent.py, you should be able to use this import:

from src.game.cards98 import GameCards98

Grzegorz Krug • Sep 2 '19 • Edited

I know it should work, but it does not. I have
__init__.py everywhere and __main__.py in the top level.
Do I need to run it with the -m param? I am definitely missing something. I was running scripts from the top level to combine modules, but it can get messy sometimes :P

This is my repo: Github Cards98

Jason C. McDonald • Sep 2 '19

It shouldn't be messy. But, yes, you'd need to invoke your top-level package (not your top-level script.)

python3 -m src
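
To see why the invocation matters, here is a rough sketch that rebuilds the layout above in a temporary directory and runs it both ways. The file contents are invented stand-ins for your modules, and run() is just a helper name for this demo.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-ins for the real modules in the layout above.
FILES = {
    "src/__init__.py": "",
    "src/__main__.py": "import src.reinforced.rl_agent\n",
    "src/game/__init__.py": "",
    "src/game/cards98.py": "class GameCards98:\n    pass\n",
    "src/reinforced/__init__.py": "",
    "src/reinforced/rl_agent.py": (
        "from src.game.cards98 import GameCards98\n"
        "print('imported', GameCards98.__name__)\n"
    ),
}


def run(args, cwd):
    """Run a fresh interpreter with `args` from the directory `cwd`."""
    return subprocess.run(
        [sys.executable, *args], cwd=cwd, capture_output=True, text=True
    )


with tempfile.TemporaryDirectory() as root:
    for relative_path, source in FILES.items():
        target = Path(root, relative_path)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(source)

    # Running the file directly puts src/reinforced/ first on sys.path,
    # so the top-level package `src` cannot be found from there.
    as_script = run(["src/reinforced/rl_agent.py"], cwd=root)

    # Running the package with -m puts the current directory first on
    # sys.path, and that directory is the one containing src/.
    as_package = run(["-m", "src"], cwd=root)

print("direct script exit code:", as_script.returncode)
print("python -m src exit code:", as_package.returncode)
```

The direct run fails with a ModuleNotFoundError in its stderr, while the -m run succeeds, because the only thing that changed is which directory Python searched first.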

By the by, I recommend renaming src to your project name, cards98, and then renaming the subpackage by the same name to something like game.

Grzegorz Krug • Sep 2 '19

Yes, now it is working. python -m cards98
Well... but only for invoking the top level in the console. It does not work for normal execution, like double-clicking __main__.py with the mouse.
This also makes debugging and testing harder, because I always have to change it in __main__.py. Where can I use it? I think it just complicates everything.

Thanks for the help in understanding this.

Jason C. McDonald • Sep 2 '19 • Edited

This is, to my knowledge, the official (and only) way to structure a Python project. There are two ways to create an executable file to start everything.

Option 1: Native Script

Many Python projects offer a Bash script (on UNIX-like systems) or a Windows .bat file that will run the python3 -m cards98 command. I see this a lot.

Option 2: Python Script

This is the method I'd recommend, as it's the most portable.

Outside of your top-level package, you can write a separate Python script, such as run.py or cards98.py, and then use that to execute your main function.

For example, in cards98/__main__.py, you can put this...

def main():
    # The logic for starting your application.
    pass

if __name__ == "__main__":
    main()

And then, outside of the cards98 package, create the file cards98.py, with the following:

#!/usr/bin/env python3
from cards98.__main__ import main
main()

To start your Python application, just double-click cards98.py.
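
If you'd rather not import from __main__ directly, another option is the standard library's runpy module, whose main job is to implement the -m switch. This is only a sketch of that alternative: the throwaway package built here stands in for your real cards98, and launch() is a helper name I made up.

```python
import contextlib
import importlib
import io
import runpy
import sys
import tempfile
from pathlib import Path

# Stand-in for cards98/__main__.py, following the pattern shown above.
MAIN_SOURCE = '''\
def main():
    print("cards98 started")

if __name__ == "__main__":
    main()
'''


def launch(package_name):
    """Run a package's __main__.py much as `python3 -m <package_name>` would."""
    return runpy.run_module(package_name, run_name="__main__")


with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / "cards98"
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")
    (package_dir / "__main__.py").write_text(MAIN_SOURCE)

    # A real launcher sits next to the package, so its own directory is
    # already on sys.path; the demo has to add the temp directory by hand.
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        captured = io.StringIO()
        with contextlib.redirect_stdout(captured):
            launch("cards98")
        output = captured.getvalue()
    finally:
        sys.path.remove(tmp)

print(output.strip())
```

In a real launcher (say, run.py next to the cards98 directory) you would keep only the runpy import and the launch("cards98") call.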

P.S. Thanks for bringing up this situation. I realized I never addressed it in the book!

Kyle R. Conway • Jan 15 '19

Thank you so much for this article. Hard to overstate how helpful this is for someone who feels relatively competent at the language but completely inexperienced at building something sane looking or structured appropriately.

Nikita Sobolev • Jan 16 '19 • Edited

Thanks for this article! It is very useful for beginners.

I would like to suggest mentioning wemake-python-styleguide in one of the future articles. In my experience, it is very helpful for beginners, since it enforces insanely strict rules for structuring and cleaning up your code. That's what stimulates learning progress!

Anyway, great series. Waiting for the next articles.

Marc Hanisch • Jul 25 '19 • Edited

Such a great article, thank you very much. I've just dived into Python, having used multiple languages before. But these are exactly the explanations needed by Python newcomers to get a better understanding of how things work in Python.

Tony • May 9 '19

Nice article!

While this is probably beyond the scope of this article, one useful addition for those that need to create packages frequently would be to look into using cookiecutter. It lets you create a "package template". While these templates can be simple, they can also include support for many dev tools such as docker, travis-ci, sphinx, doctests (via pytest/nose/etc), etc.

Once the cookiecutter template is ready, you run a quick wizard and it generates the project directory/files for you. There are also a bunch of templates already available, some of which are specialized for specific tasks (such as data analysis).

For more info:
cookiecutter.readthedocs.io/en/lat...
github.com/audreyr/cookiecutter

Johann Krauter • Jan 24 '22

Hi all,

I have a cosmetic bug when importing type hints in my docstrings. I'm using Sphinx with the intersphinx extension to build documentation from the type hints and docstrings in my code.
With the intersphinx extension and intersphinx_mapping, you can link references to the Python, NumPy, and Matplotlib documentation from your own docs.

In the screenshot you see: the first parameter is of type np.ndarray (with "import numpy as np") and has no reference link. The second parameter links to the Python documentation.

When I import numpy as "import numpy" and write numpy.ndarray in the docstring, I get the reference links in the built HTML documentation.

Can somebody explain why this does not work with the np.ndarray type hint?
