Skip to content

Sorting techniques edits #124701

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Sep 28, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
73 changes: 70 additions & 3 deletions Doc/howto/sorting.rst
Original file line number Diff line number Diff line change
Expand Up @@ -47,11 +47,14 @@ lists. In contrast, the :func:`sorted` function accepts any iterable.
Key Functions
=============

Both :meth:`list.sort` and :func:`sorted` have a *key* parameter to specify a
function (or other callable) to be called on each list element prior to making
The :meth:`list.sort` method and the functions :func:`sorted`,
:func:`min`, :func:`max`, :func:`heapq.nsmallest`, and
:func:`heapq.nlargest` have a *key* parameter to specify a function (or
other callable) to be called on each list element prior to making
comparisons.

For example, here's a case-insensitive string comparison:
For example, here's a case-insensitive string comparison using
:meth:`str.casefold`:

.. doctest::

Expand Down Expand Up @@ -272,6 +275,70 @@ to make it usable as a key function::

sorted(words, key=cmp_to_key(strcoll)) # locale-aware sort order

Strategies For Unorderable Types and Values
===========================================

A number of type and value issues can arise when sorting.
Here are some strategies that can help:

* Convert non-comparable input types to strings prior to sorting:

.. doctest::

>>> data = ['twelve', '11', 10]
>>> sorted(map(str, data))
['10', '11', 'twelve']

This is needed because most cross-type comparisons raise a
:exc:`TypeError`.

* Remove special values prior to sorting:

.. doctest::

>>> from math import isnan
>>> from itertools import filterfalse
>>> data = [3.3, float('nan'), 1.1, 2.2]
>>> sorted(filterfalse(isnan, data))
[1.1, 2.2, 3.3]

This is needed because the `IEEE-754 standard
<https://en.wikipedia.org/wiki/IEEE_754>`_ specifies that, "Every NaN
shall compare unordered with everything, including itself."

Likewise, ``None`` can be stripped from datasets as well:

.. doctest::

>>> data = [3.3, None, 1.1, 2.2]
>>> sorted(x for x in data if x is not None)
[1.1, 2.2, 3.3]

This is needed because ``None`` is not comparable to other types.

* Convert mapping types into sorted item lists before sorting:

.. doctest::

>>> data = [{'a': 1}, {'b': 2}]
>>> sorted(data, key=lambda d: sorted(d.items()))
[{'a': 1}, {'b': 2}]

This is needed because dict-to-dict comparisons raise a
:exc:`TypeError`.

* Convert set types into sorted lists before sorting:

.. doctest::

>>> data = [{'a', 'b', 'c'}, {'b', 'c', 'd'}]
>>> sorted(map(sorted, data))
[['a', 'b', 'c'], ['b', 'c', 'd']]

This is needed because the elements contained in set types do not have a
deterministic order. For example, ``list({'a', 'b'})`` may produce
either ``['a', 'b']`` or ``['b', 'a']``.

Odds and Ends
=============

Expand Down
Loading