DOC: add small guide on how to write examples that pass doctests #20037

Closed
78 changes: 78 additions & 0 deletions doc/source/contributing.rst
@@ -430,6 +430,84 @@ the documentation are also built by Travis-CI. These docs are then hosted `here
<http://pandas-docs.github.io/pandas-docs-travis>`__, see also
the :ref:`Continuous Integration <contributing.ci>` section.


can you add sub-section refs

Writing examples that pass the doctest
--------------------------------------

The "Examples" section in the docstrings follows the `doctest <https://docs.python.org/3.6/library/doctest.html>`__
format (lines beginning with ``>>>`` for code) and is also tested to ensure that
the examples shown in the API documentation are correct.

In general, the format looks like:

::

    Examples
    --------

    A small example:

    >>> s = pd.Series([1, 2, 3])
    >>> s
    0    1
    1    2
    2    3
    dtype: int64

    Taking the mean:

    >>> s.mean()
    2.0

A single docstring can be tested using the ``validate_docstrings.py`` script
(from inside the pandas repository), for example::

    python scripts/validate_docstrings.py pandas.DataFrame.mean

Alternatively, you can use ``pytest --doctest-modules pandas/core/series.py``
to run all doctests in that module, which includes those of the ``Series`` class.
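
As a rough illustration of what such a check does, the sketch below runs the
examples of one docstring with the standard-library ``doctest`` module. It is
not the code used by ``validate_docstrings.py``; the ``mean`` function is a
hypothetical stand-in so that the snippet runs without pandas installed.

```python
import doctest


def mean(values):
    """
    Return the arithmetic mean of a sequence of numbers.

    Examples
    --------
    A small example:

    >>> mean([1, 2, 3])
    2.0
    """
    return sum(values) / len(values)


# Parse the docstring into a DocTest; the globs dict plays the role of the
# names that are already available to the examples (here only ``mean``).
parser = doctest.DocTestParser()
test = parser.get_doctest(mean.__doc__, {"mean": mean}, "mean", None, 0)

# Run every ``>>>`` example and compare the real output with the expected one.
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)

print("failed:", results.failed)
print("attempted:", results.attempted)
```

If the expected output in the docstring is edited to a different value, the
runner prints a report of the mismatch and ``results.failed`` becomes non-zero.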

Getting these tests to pass can sometimes be tricky. Here are some points to
keep in mind:

* Import all needed libraries (except for pandas and numpy, which are already
  imported as ``import pandas as pd`` and ``import numpy as np``) and define all
  variables you use in the example.

* Try to avoid using random data, since unseeded random output changes between
  runs and cannot match a fixed expected result.

* If you have a code snippet that spans multiple lines, you need to use ``...``
  on the continued lines:

  ::

    >>> df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], index=['a', 'b'],
    ...                   columns=['A', 'B', 'C'])

* If you want to show a case where an exception is raised, you can do::

    >>> pd.to_datetime(["712-01-01"])
    Traceback (most recent call last):
    OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 712-01-01 00:00:00

  It is essential to include the ``Traceback (most recent call last):`` line, but
  for the actual error only the error name is sufficient.

* If there is a small part of the result that can vary (e.g. a hash or memory
  address in an object representation), you can use ``...`` to represent this part.

  If you want to show that ``s.plot()`` returns a matplotlib AxesSubplot object,
  this will fail the doctest::

    >>> s.plot()
    <matplotlib.axes._subplots.AxesSubplot at 0x7efd0c0b0690>

  However, you can do (notice the comment that needs to be added)::

    >>> s.plot()  # doctest: +ELLIPSIS
    <matplotlib.axes._subplots.AxesSubplot at ...>
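
The last three points can be tried out with the standard-library ``doctest``
module alone. The sketch below is illustrative rather than pandas' own tooling:
it uses built-ins instead of pandas so that it runs anywhere, and ``EXAMPLES``
and ``run_examples`` are hypothetical names. It also suggests that an expected
error consisting of only the exception name matches only when the
``IGNORE_EXCEPTION_DETAIL`` option is active; that the pandas test
configuration enables this option is an assumption here.

```python
import doctest

# Hypothetical docstring text; built-ins stand in for pandas objects.
EXAMPLES = '''
A statement continued over two lines uses "..." on the second line:

>>> total = sum([1, 2, 3,
...              4, 5, 6])
>>> total
21

An exception, with the traceback header and only the error name:

>>> int("not a number")
Traceback (most recent call last):
ValueError

A repr containing a memory address that changes between runs:

>>> object()  # doctest: +ELLIPSIS
<object object at ...>
'''


def run_examples(text, optionflags=0):
    """Run the doctest examples in ``text`` and return (failed, attempted)."""
    test = doctest.DocTestParser().get_doctest(text, {}, "examples", None, 0)
    runner = doctest.DocTestRunner(verbose=False, optionflags=optionflags)
    # Discard the failure report; only the counts are of interest here.
    results = runner.run(test, out=lambda message: None)
    return results.failed, results.attempted


strict = run_examples(EXAMPLES)
lenient = run_examples(EXAMPLES, doctest.IGNORE_EXCEPTION_DETAIL)
print("default flags:", strict)
print("IGNORE_EXCEPTION_DETAIL:", lenient)
```

With the default flags the bare ``ValueError`` line counts as a failure because
the error message is compared as well; with ``IGNORE_EXCEPTION_DETAIL`` all
four examples pass.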


.. _contributing.code:

Contributing to the code base