Imagine you are a creator deity, designing a body for a creature. In your benevolence, you wish for the creature to evolve over time: first, because it must respond to changes in its environment, and second, because your wisdom grows and you think of better designs for the beast. It shouldn’t remain in the same body forever!
The creature, however, might be relying on features of its present anatomy. You can’t add wings or change its scales without warning. It needs an orderly process to adapt its lifestyle to its new body. How can you, as a responsible designer in charge of this creature’s natural history, gently coax it toward ever greater improvements?
It’s the same for responsible library maintainers. We keep our promises to the people who depend on our code: we release bugfixes and useful new features. We sometimes delete features if that’s beneficial for the library’s future. We continue to innovate, but we don’t break the code of people who use our library. How can we fulfill all those goals at once?
Add Useful Features
Your library shouldn’t stay the same for eternity: you should add features that your make your library better for your users. For example, if you have a Reptile class and it would be useful to have wings for flying, go for it.
class Reptile: @property def teeth(self): return 'sharp fangs' # If wings are useful, add them! @property def wings(self): return 'majestic wings'
But beware, features come with risk. Consider the following feature in the Python standard library, and let us see what went wrong with it.
bool(datetime.time(9, 30)) == True bool(datetime.time(0, 0)) == False
This is peculiar: converting any time object to a boolean yields True, except for midnight. (Worse, the rules for timezone-aware times are even stranger.) I’ve been writing Python for more than a decade but I didn’t discover this rule until last week. What kind of bugs can this odd behavior cause in users’ code?
Consider a calendar application with a function that creates events. If an event has an end time, the function requires it to also have a start time:
def create_event(day, start_time=None, end_time=None): if end_time and not start_time: raise ValueError("Can't pass end_time without start_time") # The coven meets from midnight until 4am. create_event(datetime.date.today(), datetime.time(0, 0), datetime.time(4, 0))
Unfortunately for witches, an event starting at midnight fails this validation. A careful programmer who knows about the quirk at midnight can write this function correctly, of course:
def create_event(day, start_time=None, end_time=None): if end_time is not None and start_time is None: raise ValueError("Can't pass end_time without start_time")
But this subtlety is worrisome. If a library creator wanted to make an API that bites users, a “feature” like the boolean conversion of midnight works nicely.
The responsible creator’s goal, however, is to make your library easy to use correctly.
This feature was written by Tim Peters when he first made the datetime module in 2002. Even founding Pythonistas like Tim make mistakes. The quirk was removed, and all times are True now.
# Python 3.5 and later. bool(datetime.time(9, 30)) == True bool(datetime.time(0, 0)) == True
Programmers who didn’t know about the oddity of midnight are saved from obscure bugs, but it makes me nervous to think about any code that actually relies on the weird old behavior and didn’t notice the change. It would have been better if this bad feature were never implemented at all. This leads us to the first promise of any library maintainer:
Avoid Bad Features
The most painful change to make is when you have to delete a feature. One way to avoid bad features is to add few features in general! Make no public method, class, function, or property without a good reason. Thus:
Features are like children: conceived in a moment of passion, they must be supported for years. Don’t do anything silly just because you can. Don’t add feathers to a snake!
But of course, there are plenty of occasions when users need something from your library that it does not yet offer. How do you choose the right feature to give them? Here’s another cautionary tale.
A Cautionary Tale From asyncio
As you may know, when you call a coroutine function, it returns a coroutine object:
async def my_coroutine(): pass print(my_coroutine())
<coroutine object my_coroutine at 0x10bfcbac8>
Your code must “await” this object to actually run the coroutine. It’s easy to forget this, so the asyncio developers wanted a “debug mode” that catches this mistake. Whenever a coroutine is destroyed without being awaited, the debug mode prints a warning with a traceback to the line where it was created.
When Yury Selivanov implemented the debug mode, he added as its foundation a “coroutine wrapper” feature. The wrapper is a function that takes in a coroutine and returns anything at all. Yury used it to install the warning logic on each coroutine, but someone else could use it to turn coroutines into the string “hi!”:
import sys def my_wrapper(coro): return 'hi!' sys.set_coroutine_wrapper(my_wrapper) async def my_coroutine(): pass print(my_coroutine())
That is one hell of a customization. It changes the very meaning of “async”. Calling set_coroutine_wrapper once will globally and permanently change all coroutine functions. It is, as Nathaniel Smith wrote, “a problematic API” which is prone to misuse and had to be removed. The asyncio developers could have avoided the pain of deleting the feature if they’d better shaped it to its purpose. Responsible creators must keep this in mind:
Keep Features Narrow
Luckily, Yury had the good judgment to mark this feature provisional, so asyncio users knew not to rely on it. Nathaniel was free to replace set_coroutine_wrapper with a narrower feature that only customized the traceback depth:
import sys sys.set_coroutine_origin_tracking_depth(2) async def my_coroutine(): pass print(my_coroutine())
<coroutine object my_coroutine at 0x10bfcbac8> RuntimeWarning:'my_coroutine' was never awaited Coroutine created at (most recent call last) File "script.py", line 8, in <module> print(my_coroutine())
This is much better. There’s no more global setting that can change coroutines’ type, so asyncio users need not code as defensively. Deities should all be as farsighted as Yury:
Mark Experimental Features "Provisional"
If you have merely a hunch that your creature wants horns and a quadruple-forked tongue, introduce the features but mark them “provisional”.
You might discover that the horns are adventitious but the quadruple-forked tongue is useful after all. In the next release of your library you can delete the former and mark the latter official.
No matter how wisely we guide our creature’s evolution, there may come a time when it’s best to delete an official feature. For example, you might have created a lizard, and now you choose to delete its legs. Perhaps you want to transform this awkward creature into a sleek and modern python.
There are two main reasons to delete features. First, you might discover a feature was a bad idea, through user feedback or your own growing wisdom. That was the case with the quirky behavior of midnight. Or, the feature might have been well-adapted to your library’s environment at first, but the ecology changes. Perhaps another deity invents mammals. Your creature wants to squeeze into their little burrows and eat the tasty mammal filling, so it has to lose its legs.
Similarly, the Python standard library deletes features in response to changes in the language itself. Consider asyncio’s Lock. It has been awaitable ever since “await” was added as a keyword:
lock = asyncio.Lock() async def critical_section(): await lock try: print('holding lock') finally: lock.release()
But now, we can do “async with lock”:
lock = asyncio.Lock() async def critical_section(): async with lock: print('holding lock')
The new style is much better! It’s short, and less prone to mistakes in a big function with other try-except blocks. Since “there should be one and preferably only one obvious way to do it” the old syntax is deprecated in Python 3.7 and it will be banned soon.
It’s inevitable that ecological change will have this effect on your code too, so learn to delete features gently. Before you do so, consider the cost or benefit of deleting it. Responsible maintainers are reluctant to make their users change a large amount of their code, or change their logic. (Remember how painful it was when Python 3 removed the “u” string prefix, before it was added back.) If the code changes are mechanical, however, like a simple search and replace, or if the feature is dangerous, it may be worth deleting.
Whether to Delete a Feature
|Code must change||Change is mechanical|
|Logic must change||Feature is dangerous|
In the case of our hungry lizard, we decide to delete its legs so it can slither into a mouse’s hole and eat it. How do we go about this? We could just delete the
walk method, changing code from this:
class Reptile: def walk(self): print('step step step')
class Reptile: def slither(self): print('slide slide slide')
That’s not a good idea, the creature is used to walking! Or, in terms of a library, your users have code that relies on the existing method. When they upgrade to the latest version of your library, their code will break.
# User's code. Oops! Reptile.walk()
Therefore responsible creators make this promise:
Delete Features Gently
There’s a few steps involved in deleting a feature gently. Starting with a lizard that walks with its legs, you first add the new method, “slither”. Next, deprecate the old method.
import warnings class Reptile: def walk(self): warnings.warn( "walk is deprecated, use slither", DeprecationWarning, stacklevel=2) print('step step step') def slither(self): print('slide slide slide')
The Python warnings module is quite powerful. By default it prints warnings to stderr, only once per code location, but you can silence warnings or turn them into exceptions, among other options.
As soon as you add this warning to your library, PyCharm and other IDEs render the deprecated method with a strikethrough. Users know right away that the method is due for deletion.
What happens when they run their code with the upgraded library?
> python3 script.py DeprecationWarning: walk is deprecated, use slither script.py:14: Reptile().walk() step step step
By default, they see a warning on stderr, but the script succeeds and prints “step step step”. The warning’s traceback shows what line of the user’s code must be fixed. (That’s what the “stacklevel” argument does: it shows the call site that users need to change, not the line in your library where the warning is generated.) Notice that the error message is instructive, it describes what a library user must do to migrate to the new version.
Your users will want to test their code and prove they call no deprecated library methods. Warnings alone won’t make unittests fail, but exceptions will. Python has a command-line option to turn deprecation warnings into exceptions:
> python3 -Werror::DeprecationWarning script.py Traceback (most recent call last): File "script.py", line 14, in <module> Reptile().walk() File "script.py", line 8, in walk DeprecationWarning, stacklevel=2) DeprecationWarning: walk is deprecated, use slither
Now, “step step step” is not printed, because the script terminates with an error.
So once you’ve released a version of your library that warns about the deprecated “walk” method, you can delete it safely in the next release. Right?
Consider what your library’s users might have in their projects’ requirements:
# User's requirements.txt has a dependency on the reptile package. reptile
The next time they deploy their code, they’ll install the latest version of your library. If they haven’t yet handled all deprecations then their code will break, because it still depends on “walk”. You need to be gentler than this. There are three more promises you must keep to your users: to maintain a changelog, choose a version scheme, and write an upgrade guide.
Maintain a Changelog
Your library must have a change log; its main purpose is to announce when a feature that your users rely on is deprecated or deleted.
Changes in Version 1.1
- New function Reptile.slither()
- Reptile.walk() is deprecated and will be removed in version 2.0, use slither()
Responsible creators use version numbers to express how a library has changed, so users can make informed decisions about upgrading. A “version scheme” is a language for communicating the pace of change.
Choose a Version Scheme
There are two schemes in widespread use, semantic versioning and time-based versioning. I recommend semantic versioning for nearly any library. The Python flavor thereof is defined in PEP 440, and tools like “pip” understand semantic version numbers.
If you choose semantic versioning for your library, you can delete its legs gently with version numbers like:
1.0: First “stable” release, with walk() 1.1: Add slither(), deprecate walk()
2.0: Delete walk()