Python code structure¶
It is really very basic python package based on src-based approach. Aside of advantages mentioned under the provided link, this structure is preferred due to the following internal reasons:
src/
becomes standard interface for people to look foreasier for templating: tools can be easily configured to target
src/
- no need to configure on the fly or hardcode package namelater can evolved into multi module project.
Why python package¶
Having python package allows to much easier sharing and reusing code. It also reduces headache with hardcoding paths and with proper testing it can capture issues with missing or incompatible dependencies (as PIP does not solve versions with already installed packages.)
Tip
In one project we might work on training and inference code together with some frontend.
Python package can ship with code such that:
pip install magic[train]
installs all dependencies for training.
pip install magic
installs onnx + other inference code.
Then if you have FastAPI or Streamlit app you install lean package which reduces docker size. It also enables sharing with other users.
requirements-dev.txt
is used to keep track of packages used in development (developer use only).
The structure of python package is:
setup.cfg
- the most interesting file: declares package metadata, install dependencies etc.setup.py
- almost empty, it is used to ensurepip -e .
works due to legacy reasons.src/
directory for all code.any package name
__init__.py
__version__.py
VERSION
any python code
py.typed
- empty file signals to mypy that type annotations are present in python code.
MANIFEST.in
- optional, used to include non-python files in package (likepy.typed
or*.pyi
).
Tip
To create new package you can base on the files listed above.
Version in VERSION
file is recommended to be used together with bump_version.sh
script - see packaging page. This file is read by __version__.py
.
Python multi project¶
Usually, the above structure is enough. However, project might grow and require something more complex and the template does not support this case.
For example one can imagine a project with:
visual_processing
text_processing
Install one package, multiple modules¶
The first easiest approach is to add new python modules to src/
package like this:
src/
foo
<- (main module)bar
or
src/
package_name
__version__.py
VERSION
(…)
foo/
bar/
One package should be then a main with the same name as full package containing __version__.py
file.
Alternative and recommended way is to move VERSION
to project root and make setup.cfg
to use it (and remember to modify .bumpversion.cfg
file) - this would be a multi module python package.
pip install foo
would then result in valid code:
import foo
import bar
Install many packages, single module¶
If packages can be isolated then the best way is to move to structure:
components/
foo
src/
setup.cfg
setup.py
(…) etc.
bar
src/
(…)
Then one would call
pip install foo bar
given packages uploaded to some PIP registrypip install -e components/foo -e components/bar
for dev local install
To ensure quality one would also need to use nox/tox to test packages in isolation and without python registry it is a little cumbersome to use.
It is rather useful for a case when project have really independent modules but one have to use only single repository (monorepo case) and/or all packages are to be used together and are tied together in their release cycle.
Use multiple repositories¶
The last approach is best for really independent projects - is to basically have each package in its own repository.
This is the best clean approach, but then one have to ensure that release cycle and CI/CD is correctly setup. One can also used git submodule but this can lead to other problems but is easiest in short term especially if code is very stable.