Many good things either landed in Pythonland in 2018 or overcame their growing pains. Here are my personal favourites:
A Jupyter Notebook is a web application for executing Python (and other languages) and viewing the results in-line, including graphs, prettified tables, and markdown-formatted prose. It also automatically saves intermediate results (similar to a REPL), allows exporting to many formats, and has a hundred other features. For a deeper dive, see my PyCon talk. Jupyter Notebooks are very widely used in the community, especially in research and scientific fields. The Jupyter team very justifiably won the 2017 ACM Software System Award.
JupyterLab is an exciting improvement over traditional Jupyter notebooks. It includes some compelling features like cell drag-and-drop, inline viewing of data files (like CSV), a tabbed environment, and a more command-centered interface. It definitely still feels like a beta, with some glitches in Reveal.js slide export functionality and cell collapse not working as expected. But on the whole it’s a perfect example of a good tool getting even better and growing to fit the sophistication of its users.
mypy, a static type checker for Python, has existed for a while. However, it has gotten really good this year, to the point where you can integrate it into your production project as part of git hooks or another CI flow. I find it an extremely helpful addition to all codebases, catching the vast majority of my mistakes before I write a single line of test code. It's not without pitfalls, however. There are many cases where you have to write annotations that feel burdensome, such as __init__(self, *args) -> None, and other behaviour which I view as just strange. The lack of typeshed stubs for many common modules¹ continues to be an issue when integrating mypy into your CI system without significant configuration; the --ignore-missing-imports option becomes basically mandatory. In the future, I hope it becomes a community standard to provide type stubs for all modules intended to be used as libraries.
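To illustrate, here is a minimal sketch of the kind of annotation mypy checks; the function and names are my own invention, not from any particular project:

```python
from typing import List

def greet(names: List[str]) -> str:
    # mypy verifies that callers pass a list of strings and that
    # this function really does return a string.
    return ", ".join(f"hello {n}" for n in names)

print(greet(["ada", "grace"]))  # → hello ada, hello grace

# A call like greet(42) would be rejected by mypy before the code
# ever runs, which is exactly the class of mistake it catches.
```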
I’m really excited about Pipfiles! Pipfiles are an implementation of PEP 508, which motivates a replacement dependency-management system for requirements.txt. The top-level motivation is that dependency management with requirements.txt has long been painful. The problems seem to be well-known in the community; the closest article I’ve seen to an enumeration of them is this post. I recommend a read, but here is a TLDR:
There is no standard for requirements.txt: is it an enumeration of all primary and secondary dependencies, or just the strict requirements? Does it include pinned versions? Additionally, splitting out development-time requirements is very ad-hoc². Different groups do different things, which makes reproducible builds a problem.
Keeping the list of dependencies up to date required pip install $package followed by pip freeze > requirements.txt, which was a really clunky workflow with a ton of problems.
The dependency-management ecosystem consists of three tools and standards (pip, virtualenv, and requirements.txt) which do not interoperate cleanly. Since you’re trying to accomplish a single task, why isn’t there a single tool to help?
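For reference, a minimal Pipfile separates regular and development dependencies explicitly; the package names below are placeholders of my choosing:

```toml
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
requests = "*"

[dev-packages]
pytest = "*"
```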
Pipenv creates a virtualenv automatically, installs and manages dependencies in that virtualenv, and keeps the Pipfile and Pipfile.lock up to date.
While the idea is great, using it is very cumbersome. I’ve run into many issues using it in practice and often have to fall back on the previous way of doing things, using an explicit virtualenv for example. I also found that locking is very slow (a problem partially stemming from the setup.py standard, which is the source of many other issues in the tooling ecosystem).
f-strings are fantastic! Many others have written about the joy of f-strings, from their natural syntax to the performance improvements they bring. I see no reason to repeat these points; I just want to say it’s an amazing feature that I have been using regularly since they landed.
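As a quick refresher, a small example of what the syntax buys you (the names here are arbitrary):

```python
name = "world"
value = 42
# Expressions are interpolated in place, and format specs still work.
msg = f"hello {name}, value={value:03d}"
print(msg)  # → hello world, value=042
```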
An annoyance they introduce is the dichotomy between writing print statements and logging statements. The logging module is great, and by default does not format strings if that log message is turned off. So you might write:
x = 3
logging.debug("x=%s", x)
which would print x=3 if the log-level is set to DEBUG, but would not even perform the string interpolation if the log-level is set higher. This is because logging.debug is a function, and the strings are passed as arguments; formatting is deferred until the module knows the message will actually be emitted. You can see how it works in the very readable CPython source code. However, this functionality disappears if you write the following:
x = 3
logging.debug(f"x={x}")
The string interpolation happens regardless of the log-level. This makes sense at the language level, but the practical consequences are irritating in my natural workflow. I write print statements first when debugging my code, and when everything looks right I transform them into logging statements. So each print statement has to be manually rewritten to fit the different style of string interpolation. I don’t have a good idea of how to solve this problem, but I want to point it out, as I haven’t seen anyone else write about this particular problem.