PEP: 584
Title: Add + and - operators to the built-in dict class.
Author: Steven D'Aprano <steve at pearwood.info>
Status: Draft
Type: Standards Track
Created: 01-Mar-2019
Post-History:

DRAFT -- This is a draft document for discussion.

This PEP suggests adding merge + and difference - operators to the built-in dict class.

The merge operator will have the same relationship to the dict.update method as the list concatenation operator has to list.extend, with dict difference being defined analogously.

Dict addition will return a new dict containing the left operand merged with the right operand.

>>> d = {'spam': 1, 'eggs': 2, 'cheese': 3}
>>> e = {'cheese': 'cheddar', 'aardvark': 'Ethel'}
>>> d + e
{'spam': 1, 'eggs': 2, 'cheese': 'cheddar', 'aardvark': 'Ethel'}
>>> e + d
{'cheese': 3, 'aardvark': 'Ethel', 'spam': 1, 'eggs': 2}

The augmented assignment version operates in-place.

>>> d += e
>>> print(d)
{'spam': 1, 'eggs': 2, 'cheese': 'cheddar', 'aardvark': 'Ethel'}

As with list concatenation, the binary operator is more restrictive and requires both operands to be dicts, while the augmented assignment version accepts anything the update method accepts, such as an iterable of key/value pairs.

>>> d + [('spam', 999)]
Traceback (most recent call last):
  ...
TypeError: can only merge dict (not "list") to dict
>>> d += [('spam', 999)]
>>> print(d)
{'spam': 999, 'eggs': 2, 'cheese': 'cheddar', 'aardvark': 'Ethel'}
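
The list behaviour that this asymmetry is modelled on can be checked in current Python. The sketch below uses only existing built-in behaviour; the variable names are illustrative.

```python
items = [1, 2]

# The binary operator insists on a list on both sides.
try:
    items + (3, 4)
except TypeError as exc:
    print('TypeError:', exc)

# Augmented assignment behaves like list.extend and accepts any iterable.
items += (3, 4)
print(items)   # [1, 2, 3, 4]

# dict.update is similarly permissive about its argument.
d = {'spam': 1}
d.update([('spam', 999), ('eggs', 2)])
print(d)       # {'spam': 999, 'eggs': 2}
```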

Dict difference - will return a new dict containing the items from the left operand which are not in the right operand.

>>> d = {'spam': 1, 'eggs': 2, 'cheese': 3}
>>> e = {'cheese': 'cheddar', 'aardvark': 'Ethel'}
>>> d - e
{'spam': 1, 'eggs': 2}
>>> e - d
{'aardvark': 'Ethel'}

Augmented assignment will operate in place.

>>> d -= e
>>> print(d)
{'spam': 1, 'eggs': 2}

Like the merge operator and list concatenation, the difference operator requires both operands to be dicts, while the augmented version allows any iterable of keys.

>>> d - {'spam', 'parrot'}
Traceback (most recent call last):
  ...
TypeError: cannot take the difference of dict and set
>>> d -= {'spam', 'parrot'}
>>> print(d)
{'eggs': 2}
>>> d -= [('spam', 999)]
>>> print(d)
{'eggs': 2}

In the last example the list is treated as an iterable of keys, so the tuple ('spam', 999) is itself the candidate key. Since it is not a key of d, nothing is removed.

For the merge operator, if a key appears in both operands, the last-seen value (i.e. that from the right-hand operand) wins. This means that dict addition is not commutative: in general, d + e will not equal e + d. Dict addition thereby joins a number of other non-commutative addition operators among the builtins, including those of lists, tuples, strings and bytes.

Having the last-seen value win makes the merge operator match the semantics of the update method, so that d + e is an operator version of d.update(e).
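
Both claims can be checked against existing behaviour. The sketch below first confirms that concatenation of the built-in sequence types depends on operand order, then shows that the order of update calls determines which value survives. The names `d_then_e` and `e_then_d` are illustrative only.

```python
# Concatenation of sequences depends on operand order.
assert [1] + [2] != [2] + [1]
assert (1,) + (2,) != (2,) + (1,)
assert 'ab' + 'cd' != 'cd' + 'ab'
assert b'ab' + b'cd' != b'cd' + b'ab'

# dict.update has the same property: the later update wins.
d = {'cheese': 3}
e = {'cheese': 'cheddar'}

d_then_e = dict(d)
d_then_e.update(e)

e_then_d = dict(e)
e_then_d.update(d)

print(d_then_e)  # {'cheese': 'cheddar'}
print(e_then_d)  # {'cheese': 3}
```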

The error messages shown above are not part of the API, and may change at any time.

Rejected alternative semantics for d + e include:

  • Add only new keys from e, without overwriting existing keys in d. This may be done by reversing the operands, e + d, or by using dict difference first, d + (e - d). The latter is especially useful for the in-place version d += (e - d).

  • Raise an exception if there are duplicate keys. This seems unnecessarily restrictive and is not likely to be useful in practice. For example, updating default configuration values with user-supplied values would most often fail under the requirement that keys are unique:

    prefs = site_defaults + user_defaults + document_prefs
    
  • Add the values of e to the corresponding values of d. This is the behaviour implemented by collections.Counter.
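
For comparison, the three rejected semantics can be sketched in current Python. The helper names `merge_keep_first` and `merge_strict`, and the sample preference dicts, are hypothetical and serve only to illustrate the alternatives listed above.

```python
from collections import Counter

def merge_keep_first(d, e):
    """Add only the keys of ``e`` that are missing from ``d``."""
    new = dict(e)
    new.update(d)      # existing values in ``d`` win
    return new

def merge_strict(d, e):
    """Merge, but raise if any key appears in both operands."""
    duplicates = d.keys() & e.keys()
    if duplicates:
        raise KeyError(f'duplicate keys: {list(duplicates)}')
    new = dict(d)
    new.update(e)
    return new

site_defaults = {'font': 'serif', 'size': 10}
user_defaults = {'size': 12}

print(merge_keep_first(site_defaults, user_defaults))
# {'size': 10, 'font': 'serif'}

# Overriding a default is the common case, and the strict version rejects it.
try:
    merge_strict(site_defaults, user_defaults)
except KeyError as exc:
    print('rejected:', exc)

# Counter addition sums the values of matching keys.
totals = Counter(spam=1, eggs=2) + Counter(eggs=3, cheese=4)
print(totals)
```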

An alternative to the + operator is the pipe | operator, which is used for set union. This suggestion did not receive much support on Python-Ideas.

The + operator was strongly preferred on Python-Ideas.[1] It is more familiar than the pipe operator, matches nicely with - as a pair, and the Counter subclass already uses + for merging.

To create a new dict containing the merged items of two (or more) dicts, one can currently write:

{**d1, **d2}

but this is neither obvious nor easily discoverable. It is only guaranteed to work if the keys are all strings. If the keys are not strings, it currently works in CPython, but it may not work with other implementations, or future versions of CPython[2].

It is also limited to returning a built-in dict, not a subclass, unless re-written as MyDict(**d1, **d2), in which case non-string keys will raise TypeError.
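
The difference between the two spellings can be observed directly. This is a small sketch in which `MyDict` is a trivial placeholder subclass; the exact wording of the error message is not guaranteed.

```python
class MyDict(dict):
    """Trivial dict subclass, used only for illustration."""

d1 = MyDict(spam=1)
d2 = {'eggs': 2}

display = {**d1, **d2}
print(type(display).__name__)    # dict, even though d1 is a MyDict

via_call = MyDict(**d1, **d2)
print(type(via_call).__name__)   # MyDict

# Keyword unpacking in a call requires string keys.
error = None
try:
    MyDict(**{1: 'one'})
except TypeError as exc:
    error = exc
print('TypeError:', error)
```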

There is currently no way to perform dict subtraction except through a manual loop.
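
Such a loop might look like the following sketch. The function name `difference` is hypothetical, and the sample dicts repeat those used in the earlier examples.

```python
def difference(left, right):
    """Return a new dict with the items of ``left`` whose keys are not in ``right``."""
    new = {}
    for key, value in left.items():
        if key not in right:
            new[key] = value
    return new

d = {'spam': 1, 'eggs': 2, 'cheese': 3}
e = {'cheese': 'cheddar', 'aardvark': 'Ethel'}

print(difference(d, e))   # {'spam': 1, 'eggs': 2}
print(difference(e, d))   # {'aardvark': 'Ethel'}
```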

The implementation will be in C. (The author of this PEP would like to make it known that he is not able to write the implementation.)

An approximate pure-Python implementation of the merge operator will be:

def __add__(self, other):
    if isinstance(other, dict):
        new = type(self)()  # May be a subclass of dict.
        new.update(self)
        new.update(other)
        return new
    return NotImplemented

def __radd__(self, other):
    if isinstance(other, dict):
        new = type(other)()
        new.update(other)
        new.update(self)
        return new
    return NotImplemented

Note that the result type will be the type of the left operand; in the event of matching keys, the winner is the right operand.
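
These rules can be tried out today by placing the two methods on a subclass. The class name `MergeDict` is hypothetical; it only emulates the proposal and says nothing about how the C implementation would be written.

```python
class MergeDict(dict):
    """Hypothetical dict subclass that emulates the proposed + operator."""

    def __add__(self, other):
        if isinstance(other, dict):
            new = type(self)()      # result has the type of the left operand
            new.update(self)
            new.update(other)       # right operand wins on duplicate keys
            return new
        return NotImplemented

    def __radd__(self, other):
        if isinstance(other, dict):
            new = type(other)()     # ``other`` is the left operand here
            new.update(other)
            new.update(self)
            return new
        return NotImplemented

d = MergeDict(spam=1, cheese=3)
plain = {'cheese': 'cheddar'}

left = d + plain     # handled by MergeDict.__add__
right = plain + d    # plain dict has no __add__, so __radd__ is used

print(type(left).__name__, left)     # MergeDict {'spam': 1, 'cheese': 'cheddar'}
print(type(right).__name__, right)   # dict {'cheese': 3, 'spam': 1}
```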

Augmented assignment will just call the update method. This is analogous to the way list += calls the extend method, which accepts any iterable, not just lists.

def __iadd__(self, other):
    self.update(other)
    return self  # Augmented assignment rebinds the name to this result.

An approximate pure-Python implementation of the difference operator will be:

def __sub__(self, other):
    if isinstance(other, dict):
        new = type(self)()
        for k in self:
            if k not in other:
                new[k] = self[k]
        return new
    return NotImplemented

def __rsub__(self, other):
    if isinstance(other, dict):
        new = type(other)()
        for k in other:
            if k not in self:
                new[k] = other[k]
        return new
    return NotImplemented

Augmented assignment will operate on equivalent terms to update. If the operand has a keys method, it will be used; otherwise the operand will be iterated over:

def __isub__(self, other):
    if hasattr(other, 'keys'):
        for k in other.keys():
            if k in self:
                del self[k]
    else:
        for k in other:
            if k in self:
                del self[k]
    return self  # Augmented assignment rebinds the name to this result.

These semantics are intended to match those of update as closely as possible. For the dict built-in itself, calling keys is redundant as iteration over a dict iterates over its keys; but for subclasses or other mappings, update prefers to use the keys method.

Attention!

The above paragraph may be inaccurate. Although the dict docstring states that keys will be called if it exists, this does not seem to be the case for dict subclasses. Bug or feature?
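
One way to investigate is to record whether keys is actually called. The following is only a probe with a hypothetical class name, `Recording`; what it reports may differ between Python implementations and versions, so no particular result is asserted here.

```python
class Recording(dict):
    """dict subclass that records whether keys() has been called."""
    keys_called = False

    def keys(self):
        self.keys_called = True
        return super().keys()

source = Recording(spam=1)
target = {}
target.update(source)

print(target)               # {'spam': 1}
print(source.keys_called)   # implementation-dependent
```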

(Or when to avoid using these new operators.)

For merging multiple dicts, the d1 + d2 + d3 + d4 + ... idiom will suffer from the same unfortunate O(N**2) performance as list and tuple addition, and for similar reasons: each intermediate result is a new dict into which everything merged so far is copied again. If one expects to be merging a large number of dicts where performance is an issue, it may be better to use an explicit loop and in-place merging:

new = {}
for d in many_dicts:
    new += d

This is unlikely to be a problem in practice as most uses of the merge operator are expected to only involve a small number of dicts. Similarly, most uses of list and tuple concatenation only use a few objects.
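
The difference in work can be made visible in current Python by counting how many items each approach writes. This is a rough sketch, with {**a, **b} standing in for the proposed binary operator and update standing in for the proposed in-place operator; the counter `copies` is illustrative only.

```python
many_dicts = [{i: i * i} for i in range(5)]

# Repeated copying: each step builds a brand-new dict from everything so far.
copies = 0
quadratic = {}
for d in many_dicts:
    quadratic = {**quadratic, **d}
    copies += len(quadratic)    # items written into the new dict at this step

# In-place merging: each item is written once.
linear = {}
for d in many_dicts:
    linear.update(d)

print(copies, len(linear))      # 15 5
```

With N single-item dicts the first loop writes 1 + 2 + ... + N items, while the second writes N.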

Using the dict augmented assignment operators on a dict inside a tuple (or other immutable data structure) will lead to the same problem that occurs with list concatenation[3], namely the in-place update of the dict will succeed, but the subsequent assignment back into the tuple will raise an exception.

>>> a_tuple = ({'spam': 1, 'eggs': 2}, None)
>>> a_tuple[0] += {'spam': 999}
Traceback (most recent call last):
  ...
TypeError: 'tuple' object does not support item assignment
>>> a_tuple[0]
{'spam': 999, 'eggs': 2}
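
The existing list behaviour can be reproduced in current Python, and calling the mutating method directly sidesteps the item assignment altogether. A short sketch, with illustrative variable names:

```python
a_tuple = ([1, 2], None)
try:
    a_tuple[0] += [3]
except TypeError as exc:
    print('TypeError:', exc)
print(a_tuple[0])   # [1, 2, 3]: the list was extended before the error

# Calling the method avoids assigning back into the tuple.
b_tuple = ({'spam': 1, 'eggs': 2}, None)
b_tuple[0].update({'spam': 999})
print(b_tuple[0])   # {'spam': 999, 'eggs': 2}
```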

Similar remarks apply to the - operator.

Should these operators be part of the ABC Mapping API?

Source: https://github.com/python/peps/blob/master/pep-0584.txt