Title: Python local packages directory
Author: Kushal Das <mail at kushaldas.in>, Steve Dower <steve.dower at python.org>, Donald Stufft <donald at stufft.io>, Nick Coghlan <ncoghlan at gmail.com>
This PEP proposes to add to Python a mechanism to automatically recognize a __pypackages__ directory and prefer importing packages installed in this location over user or global site-packages. This avoids the need to create, activate, or deactivate "virtual environments". Python will use the __pypackages__ from the base directory of the script when present.
Python virtual environments have become an essential part of development and teaching workflow in the community, but at the same time, they create a barrier to entry for many. The following are a few of the issues people run into while being introduced to Python (or programming for the first time).
- Understanding how virtual environments work is a lot of new information for a beginner, and explaining them takes significant extra time and effort.
- Different platforms and shell environments require different sets of commands to activate a virtual environment. Any workshop or teaching environment with participants running different operating systems on their laptops creates a lot of confusion.
- Virtual environments need to be activated in each opened terminal. A newly created or opened terminal does not, by default, get the same environment as a previous terminal where a virtual environment was activated.
When the Python binary is executed, it attempts to determine its prefix (stored in sys.prefix), which is then used to find the standard library and other key files, and by the site module to determine the location of the site-packages directories. Currently the prefix is found -- assuming PYTHONHOME is not set -- by first walking up the filesystem tree looking for a marker file (os.py) that signifies the presence of the standard library, and if none is found, falling back to the build-time prefix hard coded in the binary. The result of this process is the contents of sys.path -- a list of locations that the Python import system will search for modules.
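The two values described above can be inspected from any Python session; this snippet simply prints the computed prefix and the resulting module search path:

```python
import sys

# sys.prefix: the prefix found at startup, used to locate the
# standard library and (via the site module) site-packages.
print("prefix:", sys.prefix)

# sys.path: the ordered list of locations the import system searches.
for entry in sys.path:
    print(entry)
```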
This PEP proposes to add a new step in this process. If a __pypackages__ directory is found in the current working directory, then it will be included in sys.path after the current working directory and just before the system site-packages. This way, if the Python executable starts in the given project directory, it will automatically find all the dependencies inside of __pypackages__.
For Python scripts, Python will try to find __pypackages__ in the same directory as the script. If it is found (and contains a directory for the current Python version), it will be used; otherwise Python will behave as it does currently.
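The lookup rule just described can be sketched as a small helper. This is illustrative only -- the function name and its exact behavior are assumptions for the sketch, not part of any shipped interpreter; the real logic would run during interpreter startup:

```python
import os
import sys

def find_pypackages(script=None):
    """Hypothetical sketch of the __pypackages__ lookup in this PEP.

    For a script invocation, the base directory is the directory the
    script resides in; for the interactive interpreter or "-m", it is
    the current working directory.
    """
    if script is not None:
        base = os.path.dirname(os.path.abspath(script))
    else:
        base = os.getcwd()
    version = "%d.%d" % sys.version_info[:2]
    candidate = os.path.join(base, "__pypackages__", version, "lib")
    # Only use the directory if it exists for the current Python version.
    return candidate if os.path.isdir(candidate) else None
```

A tool (or the interpreter itself) would then insert the returned path into sys.path just before the site-packages entries, as described above.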
If a package management tool finds a __pypackages__ directory in the current working directory, it will install packages there, creating the per-Python-version subdirectory if required.
Projects that use a source management system can include a __pypackages__ directory (empty, or with e.g. a file like .gitignore). After a fresh checkout of the source code, a tool like pip can be used to install the required dependencies directly into this directory.
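A possible workflow under this layout might look as follows. The directory name and per-version structure come from this PEP; pip's --target option is real, but first-class __pypackages__ support in installers is an assumption here, not current behavior:

```shell
# Create the directory so it can be checked into source control.
mkdir -p __pypackages__
touch __pypackages__/.gitignore

# After a fresh checkout, dependencies could be installed into the
# per-version directory with something like (assumed workflow):
#   pip install --target __pypackages__/3.8/lib -r requirements.txt
```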
The following shows an example project directory structure, and different ways the Python executable and any script will behave.
    foo
        __pypackages__
            3.8
                lib
                    bottle
        myscript.py

    /> python foo/myscript.py
    sys.path[0] == 'foo'
    sys.path[1] == 'foo/__pypackages__/3.8/lib'

    /> cd foo

    foo> /usr/bin/ansible
        #! /usr/bin/env python3
    foo> python /usr/bin/ansible

    foo> python myscript.py

    foo> python
    sys.path[0] == '.'
    sys.path[1] == './__pypackages__/3.8/lib'

    foo> python -m bottle
We have a project directory called foo and it has a __pypackages__ inside of it. We have bottle installed in that __pypackages__/3.8/lib, and have a myscript.py file inside of the project directory. We have used whatever tool we generally use to install bottle in that location.
For invoking a script, Python will try to find a __pypackages__ inside of the directory that the script resides in, /usr/bin. The same will happen in the last example, where we are executing /usr/bin/ansible from inside of the foo directory. In both cases, it will not use the __pypackages__ in the current working directory.
Similarly, if we invoke myscript.py from the first example, it will use the __pypackages__ directory that is in the foo directory.
If we go inside of the foo directory and start the Python executable (the interpreter), it will find the __pypackages__ directory inside of the current working directory and use it in the sys.path. The same happens if we try to use the -m and use a module. In our example, bottle module will be found inside of the __pypackages__ directory.
The above two examples are the only cases where the __pypackages__ from the current working directory is used.
In another example scenario, a trainer of a Python class can say "Today we are going to learn how to use Twisted! To start, please checkout our example project, go to that directory, and then run python3 -m pip install twisted."
That will install Twisted into a directory separate from python3. There's no need to discuss virtual environments, global versus user installs, etc. as the install will be local by default. The trainer can then just keep telling them to use python3 without any activation step, etc.
This does not affect the behavior of any older version of Python.
Other Python implementations will need to replicate the new behavior of the interpreter bootstrap: locating the __pypackages__ directory and, if it is present, adding it to sys.path just before site-packages.
A proof-of-concept implementation is available (in the pypackages branch).
Rejected alternative names for the directory include __pylocal__ and python_modules.