Dependencies are a nightmare for many people. Some even argue they are technical debt. Managing the list of the libraries of your software is a horrible experience. Updating them — automatically? — sounds like a delirium.
Stick with me here as I am going to help you get a better grasp on something that you cannot, in practice, get rid of — unless you're incredibly rich and talented and can live without the code of others.
First, we need to be clear about something regarding dependencies: there are two types of them. Donald Stufft wrote about the subject years ago, better than I would. To make it simple, one can say that there are two types of code packages depending on external code: applications and libraries.
Python libraries should specify their dependencies in a generic way. A library should not require requests 2.1.5: it does not make sense. If every library out there needs a different version of requests, they can't be used at the same time.
Libraries need to declare dependencies based on ranges of version numbers. Requiring requests>=2 is correct. Requiring requests>=1,<2 is also correct if you know that requests 2.x does not work with the library. The only problem your version range specification solves is API compatibility between your code and your dependencies, nothing else. That's a good reason for libraries to use Semantic Versioning whenever possible.
Therefore, dependencies should be written in setup.py as something like:

from setuptools import setup

setup(
    name="MyLibrary",
    version="1.0",
    install_requires=[
        "requests",
    ],
    # ...
)
This way, it is easy for any application to use the library and co-exist with others.
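To see why ranges let libraries coexist while exact pins do not, here is a minimal sketch of the idea. It is a deliberately simplified matcher: the helper names `parse`, `satisfies`, and `shared_versions` are made up for this example, the version list is invented, and it only understands plain dotted numbers with a handful of comparison operators. Real installers implement much richer version-specifier rules.

```python
import operator

# Comparison operators supported by this simplified matcher.
# Two-character operators come first so ">=" is not mistaken for ">".
OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def parse(version):
    """Turn "2.1" into (2, 1, 0) so versions compare numerically."""
    parts = [int(part) for part in version.split(".")]
    parts += [0] * (3 - len(parts))  # pad short versions with zeros
    return tuple(parts)


def satisfies(version, spec):
    """Return True if `version` matches every clause of `spec`.

    `spec` is a comma-separated range such as ">=1,<2".
    """
    for clause in spec.split(","):
        clause = clause.strip()
        for symbol, compare in OPERATORS.items():
            if clause.startswith(symbol):
                if not compare(parse(version), parse(clause[len(symbol):])):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


def shared_versions(available, specs):
    """Versions from `available` that satisfy every library's range."""
    return [v for v in available if all(satisfies(v, s) for s in specs)]


# Invented list of released versions of one dependency.
available = ["1.2.0", "2.1.5", "2.22.0", "3.0.0"]

# Two hypothetical libraries that both declare ranges: there is overlap.
print(shared_versions(available, [">=2", ">=2.1,<3"]))  # ['2.1.5', '2.22.0']

# Two hypothetical libraries that each pin an exact version: no overlap.
print(shared_versions(available, ["==2.1.5", "==2.22.0"]))  # []
```

The second call returns an empty list, which is exactly the situation where two libraries cannot be installed side by side.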
An application is just a particular case of a library. It is not intended to be reused (imported) by other libraries or applications, though nothing would prevent it in practice.
In the end, that means that you should specify the dependencies the same way that you would do for a library in the application's setup.py.
The main difference is that an application is usually deployed in production to provide its service. Deployments need to be reproducible. For that, you can't solely rely on setup.py: the requested ranges of the dependencies are too broad. You're at the mercy of random version changes at any time when re-deploying your application.
You, therefore, need a different version management mechanism to handle deployment than just setup.py.
pipenv has an excellent section recapping this in its documentation. It splits dependency types into abstract and concrete dependencies: abstract dependencies are based on ranges (e.g., libraries) whereas concrete dependencies are specified with precise versions (e.g., application deployments) — as we've just seen here.
A requirements.txt file has been used to solve application deployment reproducibility for a long time now. Its format is usually one requirement per line, each pinned with == to an exact version.
Each library sees itself specified to the micro version. That makes sure each of your deployments is going to install the same version of your dependency. Using a requirements.txt is a simple solution and a first step toward reproducible deployment. However, it's not enough.
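One quick way to see whether a requirements.txt really is concrete is to look for entries that are not pinned with `==`. The sketch below is only an illustration: `unpinned` is a hypothetical helper, the sample file content is invented, and the check ignores options, URLs, and environment markers that a real requirements file may contain.

```python
def unpinned(requirements_text):
    """Return the requirement lines that do not pin an exact version."""
    loose = []
    for line in requirements_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        if "==" not in line:
            loose.append(line)
    return loose


# Invented sample content, not taken from a real project.
sample = """\
requests==2.22.0
redis>=3.3   # a range: the installed version can change between deployments
flask
"""

print(unpinned(sample))  # ['redis>=3.3', 'flask']
```

An empty result means every listed package is pinned, although, as the next paragraphs explain, that still says nothing about the dependencies of those packages.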
Indeed, while you can specify which version of requests you want, if requests depends on urllib3, that could get you urllib3 2.1 or urllib3 2.2. You can't know which one will be installed, which does not make your deployment 100% reproducible.
Of course, you could duplicate all of the requests dependencies yourself in your requirements.txt, but that would be madness!
There are various hacks available to fix this limitation, but the real saviors here are pipenv and poetry. The way they solve it is similar to many package managers in other programming languages. They generate a lock file that contains the list of all installed dependencies (and their own dependencies, etc.) with their version numbers. That makes sure the deployment is 100% reproducible.
Check out their documentation on how to set up and use them!
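The idea behind a lock file can be sketched in a few lines. This is not how pipenv or poetry are implemented, and it skips version resolution entirely: `INSTALLED`, `DEPENDS_ON`, and `lock` are hypothetical names, and the package data is made up. It only shows that the lock records the whole dependency tree, not just what you asked for.

```python
# Made-up snapshot of an environment: package -> installed version.
INSTALLED = {
    "requests": "2.22.0",
    "urllib3": "1.25.3",
    "idna": "2.8",
    "redis": "3.3.6",
}

# Made-up dependency graph: package -> packages it needs.
DEPENDS_ON = {
    "requests": ["urllib3", "idna"],
    "urllib3": [],
    "idna": [],
    "redis": [],
}


def lock(top_level):
    """Pin every package reachable from `top_level`, at any depth."""
    pinned = {}
    to_visit = list(top_level)
    while to_visit:
        name = to_visit.pop()
        if name in pinned:
            continue  # already recorded; this also guards against cycles
        pinned[name] = INSTALLED[name]
        to_visit.extend(DEPENDS_ON[name])
    return [f"{name}=={version}" for name, version in sorted(pinned.items())]


# The application only asks for requests, yet the lock lists three packages.
print(lock(["requests"]))
# ['idna==2.8', 'requests==2.22.0', 'urllib3==1.25.3']
```

Installing from such a list gives the same versions on every deployment, because nothing is left for the installer to choose.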
Now that you have your lock file that makes sure your deployment is reproducible in a snap, you've another problem. How do you make sure that your dependencies are up-to-date? There is a real security concern about this, but also bug fixes and optimizations that you might miss by staying behind.
If your project is hosted on GitHub, Dependabot is an excellent solution to this issue. Enabling this application on your repository automatically creates pull requests whenever a new version of a library listed in your lock file is available. For example, if you've deployed your application with redis 3.3.6, Dependabot will create a pull request updating to redis 3.3.7 as soon as it gets released. Furthermore, Dependabot supports requirements.txt, pipenv, and poetry!
You're almost there. You have a bot that is letting you know that a new version of a library your project needs is available.
Once the pull request is created, your continuous integration system is going to kick in, deploy your project, and run the tests. If everything works fine, your pull request is ready to be merged. But are you really needed in this process?
Unless you have a particular and personal aversion to specific version numbers ("Gosh I hate versions that end with a 3. It's always bad luck.") or unless you have zero automated testing, you, the human, are useless. This merge can be fully automatic.
This is where Mergify comes into play. Mergify is a GitHub application that lets you define precise rules about how to merge your pull requests. Here's a rule that I use in every project:
pull_request_rules:
  - name: automatic merge from dependabot
    conditions:
      - author~=^dependabot(|-preview)\[bot\]$
      - label!=work-in-progress
      - "status-success=ci/circleci: pep8"
      - "status-success=ci/circleci: py37"
    actions:
      merge:
        method: merge
As soon as your continuous integration system passes, Mergify merges the pull request for you.
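The author condition in that rule is a regular expression, so it is worth checking which account names it accepts before trusting it with automatic merges. The sketch below tries the same pattern with Python's `re` module. It assumes the pattern behaves under Python's regex rules as it does in Mergify; `is_dependabot` is a hypothetical helper, and the last two account names are invented.

```python
import re

# Pattern copied from the rule's author condition.
DEPENDABOT_AUTHOR = re.compile(r"^dependabot(|-preview)\[bot\]$")


def is_dependabot(author):
    """Return True if `author` is matched by the rule's pattern."""
    return DEPENDABOT_AUTHOR.search(author) is not None


for author in ["dependabot[bot]", "dependabot-preview[bot]",
               "dependabot-fake[bot]", "alice"]:
    print(author, is_dependabot(author))
```

Because the pattern is anchored with `^` and `$`, only the two exact bot names pass; a look-alike account or a regular contributor does not trigger the automatic merge.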
You can then automatically trigger your deployment hooks to update your production deployment and get the new library version installed right away. This leaves your application always up-to-date with newer libraries and not lagging behind several years of releases.
If anything goes wrong, you're still able to revert the commit from Dependabot — which you can also automate if you wish with a Mergify rule.
This is, to me, the state of the art of the dependency management lifecycle right now. And while this applies exceptionally well to Python, it can be applied to many other languages that use a similar pattern, such as Node and npm.