Python 3.9 is around the corner


By John Coggeshall
September 22, 2020

Python 3.9.0rc2 was released on September 17, with the final version scheduled for October 5, roughly a year after the release of Python 3.8. Python 3.9 will come with new operators for dictionary unions, a new parser, two string operations meant to eliminate some longstanding confusion, as well as improved time-zone handling and type hinting. Developers may need to do some porting for code coming from Python 3.8 or earlier, as the new release has removed several previously-deprecated features still lingering from Python 2.7.

Python 3.9 marks the start of a new release cadence. Up until now, Python has done releases on an 18-month cycle. Starting with Python 3.9, the language has shifted to an annual release cycle as defined by PEP 602 ("Annual Release Cycle for Python").

A table provided by the project shows how Python performance has changed in a number of areas since Python 3.4. It is interesting to note that Python 3.9 is worse than 3.8 on almost every benchmark in that table, though it does perform generally better than 3.7. That said, it is claimed that several Python constructs such as range, tuple, list, and dict will see improved performance in Python 3.9, though no specific performance benchmarks are given. The boost is credited to the language making more use of a fast-calling protocol for CPython that is described in PEP 590 ("Vectorcall: a fast calling protocol for CPython").

As the PEP explains, Vectorcall replaces the existing tp_call convention, which performs poorly because it must create intermediate objects for a call. While CPython has special-case optimizations to speed up this process for calls to Python and built-in functions, those do not apply to classes or third-party extension objects. Additionally, tp_call does not provide a function pointer per object (only per class), again requiring the creation of several intermediate objects when making calls to classes. Vectorcall is faster because it avoids the intermediate-object inefficiencies found in tp_call. Vectorcall was introduced in Python 3.8, but starting with version 3.9 it is used for the majority of the Python calling conventions.

New operators and methods

Python 3.9 includes new dictionary union operators, | and |=, which we have previously covered; they are used to merge dictionaries. The | operator evaluates as a union of two dictionaries, while the |= operator stores the result of the union in the left-hand side of the operation:

 >>> z = {'a' : 1, 'b' : 2, 'c' : 3}
 >>> y = {'c' : 'foo', 'd' : 'bar' }
 >>> z | y
 {'a': 1, 'b': 2, 'c': 'foo', 'd': 'bar'}
 >>> z |= y
 >>> z
 {'a': 1, 'b': 2, 'c': 'foo', 'd': 'bar'}

There are many ways dictionaries can be merged in Python, but Andrew Barnert said that the operator is designed to address the "copying update":

The problem is the copying update. The only way to spell it is to store a copy in a temporary variable and then update that. Which you can’t do in an expression. You can do _almost_ the same thing with {**a, **b}, but not only is this ugly and hard to discover, it also gives you a dict even if a was some other mapping type, so it’s making your code more fragile, and usually not even doing so intentionally.

In situations where the two dictionaries share a common key, the last-seen value for a key "wins" and is included in the merge as shown above for key c. While the standard union operator | only allows unions between dict types, the assignment variety |= can be used to update a dictionary with new key-value pairs from an iterable object:

 >>> z = {'a' : 'foo', 'b' : 'bar', 'c' : 'baz'}
 >>> y = ((0, 0), (1, 1), (2, 8))
 >>> z |= y
 >>> z
 {'a': 'foo', 'b': 'bar', 'c': 'baz', 0: 0, 1: 1, 2: 8}

PEP 584 ("Add Union Operators To dict") provides complete documentation of the new operators.
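The type-preservation point in Barnert's comment, and the difference between | and |=, can be seen in a short sketch. It assumes Python 3.9 or later; the variable names are illustrative only, and OrderedDict is used merely as a convenient example of a non-dict mapping type that implements the new operators:

```python
from collections import OrderedDict

base = OrderedDict(a=1, b=2)
extra = {"b": 20, "c": 30}

# The union operator lets the left operand's type decide the result type...
merged_with_operator = base | extra
# ...while unpacking always builds a plain dict.
merged_with_unpacking = {**base, **extra}

print(type(merged_with_operator).__name__)   # OrderedDict
print(type(merged_with_unpacking).__name__)  # dict

# | rejects an iterable of key-value pairs, but |= accepts one.
pairs = [("x", 1), ("y", 2)]
try:
    {"a": 1} | pairs
    union_raised = False
except TypeError:
    union_raised = True

target = {"a": 1}
target |= pairs
print(union_raised, target)
```

In both merges the value for the shared key b comes from the right-hand operand, matching the "last seen wins" rule described above.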

Two new string methods have also been added in version 3.9: removeprefix() and removesuffix(). These convenience methods make it easy to remove an unwanted prefix or suffix from string data. As described in PEP 616 ("String methods to remove prefixes and suffixes"), these methods are being added to address user confusion regarding the str.lstrip() and str.rstrip() methods, which are often mistaken as a means to remove a prefix or suffix from a string. According to the PEP, the confusion stems from their optional string parameter, which is interpreted as a set of individual characters to remove rather than as a single substring. With the additions, the project hopes to provide a "cleaner redirection of users to the desired behavior." Using these new methods is straightforward, as shown below:

 >>> a = "PEP-616"
 >>> a.removeprefix("PEP-")
 '616'
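The confusion that PEP 616 describes is easier to see side by side. The following sketch (the file and host names are made up for illustration) compares the character-set behavior of the strip methods with the substring behavior of the new ones:

```python
filename = "happy.py"
hostname = "www.wow.example"

# rstrip()/lstrip() treat the argument as a set of characters to strip,
# so they keep removing characters for as long as each one is in the set.
stripped_name = filename.rstrip(".py")    # 'ha'
stripped_host = hostname.lstrip("www.")   # 'ow.example'

# The new methods remove the exact substring, and at most once.
clean_name = filename.removesuffix(".py")   # 'happy'
clean_host = hostname.removeprefix("www.")  # 'wow.example'

# A string that lacks the given prefix is returned unchanged.
untouched = hostname.removeprefix("ftp.")   # 'www.wow.example'

print(stripped_name, stripped_host, clean_name, clean_host, untouched)
```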

Deprecation and porting

Developers should be aware of some features that are being deprecated and removed in 3.9, as well as some more deprecations that are coming in 3.10. Many Python 2.7 functions that emit a DeprecationWarning in version 3.8 "have been removed or will be removed soon" starting with version 3.9. The project recommends testing applications with the -W default command-line option, which will show these warnings, before upgrading. As we previously covered, certain backward-compatibility layers, such as the aliases to Abstract Base Classes in the collections module, will remain for one last release before being removed in Python 3.10. The complete listing of removals in version 3.9 is available for interested readers. Further, the release includes numerous new deprecations of language features that will be removed in a future release. An additional recommendation is to run tests in Python Development Mode using the -X dev option to prepare code bases for future changes.
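A test suite can also surface these warnings programmatically. The sketch below is only an illustration: old_api() is a hypothetical function standing in for any deprecated call, and the "default" filter installed inside the block is roughly what the -W default option enables for the whole process:

```python
import warnings


def old_api():
    """Hypothetical deprecated function, standing in for a real library call."""
    warnings.warn("old_api() is deprecated", DeprecationWarning, stacklevel=2)
    return 42


# Record warnings instead of printing them, with the "default" action
# enabled so that DeprecationWarning is not silently ignored.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("default")
    result = old_api()

deprecations = [w for w in caught if issubclass(w.category, DeprecationWarning)]
for warning in deprecations:
    print(f"{warning.category.__name__}: {warning.message}")
```

Replacing "default" with "error" turns each such warning into an exception, which can help locate the offending call in a large code base.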

Other goodies

As we reported, Python 3.9 ships with a new parsing expression grammar (PEG) parser to replace the LL(1) parser used in version 3.8. According to PEP 617 ("New PEG parser for CPython"), which describes the change, the switch to the PEG parser will eliminate "multiple 'hacks' that exist in the current grammar to circumvent the LL(1)-limitation." This should help the project substantially reduce the maintenance cost for the parser.

Python introduced type hinting in version 3.5; the 3.9 release allows types like List and Dict to be replaced with the built-in list and dict varieties. Type hints in Python are mostly for linters and code checkers, as they are not enforced at run time by CPython. PEP 585 ("Type Hinting Generics In Standard Collections") provides a listing of collections that have become generics. Note that, with version 3.9, importing the types (from typing) that are now built-in is deprecated. It sounds like developers will have plenty of time to update their code, however, as according to the PEP: "the deprecated functionality will be removed from the typing module in the first Python version released 5 years after the release of Python 3.9.0."
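As a brief illustration of the PEP 585 syntax (the function and variable names here are invented for the example, and Python 3.9 or later is assumed), the built-in collections can be subscripted directly in annotations:

```python
def count_words(words: list[str]) -> dict[str, int]:
    """Count occurrences of each word, annotated with built-in generics."""
    counts: dict[str, int] = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


totals = count_words(["spam", "eggs", "spam"])

# The hints are not enforced at run time: passing a tuple where the
# annotation says list[str] still works.
unchecked = count_words(("spam", "spam"))

# Subscripting a built-in type yields an alias that records its parameters.
alias = list[str]
print(totals, unchecked, alias.__origin__, alias.__args__)
```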

Thanks to flexible function and variable annotations, as described in PEP 593 ("Flexible function and variable annotations"), Python 3.9 has a new Annotated type. This allows the decoration of existing types with context-specific metadata:

 charType = Annotated[int, ctype("char")]

This metadata can be used either in static analysis or at run time; it is ignored entirely if it is unused. It is designed to enable tools like mypy to perform static type checking and provides access to the metadata at run time via get_type_hints(). To provide backward compatibility with version 3.8, a new include_extras parameter has been added to the get_type_hints() function with a default value of False, retaining the same behavior as existed in version 3.8. When include_extras is set to True, get_type_hints() will return the defined Annotated type for use.
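The effect of include_extras can be sketched as follows. The ctype class here is a hypothetical stand-in for whatever metadata object a tool might define (PEP 593 does not prescribe one), and put_char() exists only to carry the annotation:

```python
from dataclasses import dataclass
from typing import Annotated, get_type_hints


@dataclass(frozen=True)
class ctype:
    """Hypothetical metadata marker; not part of the standard library."""
    kind: str


CharType = Annotated[int, ctype("char")]


def put_char(value: CharType) -> None:
    """Placeholder function whose only purpose is to carry the annotation."""


# Default behavior, same as in 3.8: the metadata is stripped.
plain_hints = get_type_hints(put_char)

# With include_extras=True the Annotated wrapper is preserved.
full_hints = get_type_hints(put_char, include_extras=True)

print(plain_hints["value"])
print(full_hints["value"].__metadata__)
```

A run-time consumer would inspect the __metadata__ tuple and act only on the entries it recognizes, ignoring the rest.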

Various other language changes can be expected in Python 3.9. __import__() now raises ImportError instead of ValueError when a relative import goes past the top-level package. Decorators have also been improved as described in PEP 614 ("Relaxing Grammar Restrictions On Decorators"), allowing any valid expression (defined as "anything that's valid as a test in if, elif, and while blocks") to be used to invoke them. In Python 3.8, the expressions that can be used to invoke a decorator are limited. While the decorator grammar limitations "were rarely encountered in practice", according to the PEP, they occurred often enough over the years to be worth fixing in 3.9. The PEP has an example showing how PyQt5 currently works around the limitations.
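To give a sense of what the relaxed grammar permits, here is a small sketch (all names are invented for illustration) in which the decorator is selected by subscripting a dictionary, something the 3.8 grammar rejects as a syntax error:

```python
import functools


def shout(func):
    """Decorator that upper-cases the wrapped function's result."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()
    return wrapper


def whisper(func):
    """Decorator that lower-cases the wrapped function's result."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).lower()
    return wrapper


styles = {"loud": shout, "quiet": whisper}


# A subscript expression used directly as a decorator (Python 3.9+).
@styles["loud"]
def greet(name):
    return f"Hello, {name}"


print(greet("World"))  # HELLO, WORLD
```

On earlier versions, the usual workaround is to bind the expression to a name first (for example loud = styles["loud"]) and decorate with that name.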

Two new modules are provided as part of the Python 3.9 standard library: zoneinfo and graphlib. The zoneinfo module, which we have previously covered, provides support for the IANA time zone database and includes zoneinfo.ZoneInfo, which is a concrete datetime.tzinfo implementation allowing users to load time zone data identified by an IANA name. The graphlib module provides graphlib.TopologicalSorter, a class that implements topological sorting of graphs. In addition to these two new modules, many existing modules were improved in various ways. One notable change involves the asyncio module, which no longer supports the reuse_address parameter of asyncio.loop.create_datagram_endpoint() due to "significant security concerns." The bug report describes a problem when using SO_REUSEADDR on UDP in Linux environments. Setting SO_REUSEADDR allows multiple processes to listen on sockets for the same UDP port, which will pass incoming packets to each randomly; setting reuse_address to True in a Python script would enable this behavior.
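As a quick illustration of the graphlib module, the sketch below orders a made-up set of build steps. TopologicalSorter accepts a dictionary that maps each node to the set of nodes that must come before it; the step names are purely illustrative:

```python
from graphlib import CycleError, TopologicalSorter

# Each key maps to the set of steps that must run before it.
build_steps = {
    "package": {"test"},
    "test": {"compile"},
    "compile": {"fetch"},
}

order = list(TopologicalSorter(build_steps).static_order())
print(order)  # ['fetch', 'compile', 'test', 'package']

# A graph with a cycle cannot be sorted and raises CycleError.
try:
    list(TopologicalSorter({"a": {"b"}, "b": {"a"}}).static_order())
    cycle_detected = False
except CycleError:
    cycle_detected = True
print(cycle_detected)  # True
```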

There are a lot of interesting things worth checking out in Python 3.9, and the project's "What's new in Python 3.9" document is recommended for all the details. Additionally, the changelog provides an itemized list of changes between release candidates. Since no more release candidates of Python 3.9 are expected before the final version, developers may want to start testing their existing code to get a head start on the final release.

