PLT: Cleaner plotting backend API, and unify Series and DataFrame accessors #27009

Merged
merged 33 commits into from
Jul 3, 2019
Commits
f279d16
Some experiments so far
datapythonista Jun 21, 2019
ca5671c
Refactoring of pandas plotting to make the API clearer
datapythonista Jun 23, 2019
8a56ad7
Merge remote-tracking branch 'upstream/master' into plot_api
Jun 25, 2019
196388b
Addressing review comments, and fixing many tests (still some tests f…
Jun 25, 2019
f00ec30
Merge remote-tracking branch 'upstream/master' into plot_api
Jun 25, 2019
1c16faa
Merge remote-tracking branch 'upstream/master' into plot_api
Jun 26, 2019
995d72e
Restoring docstrings of hist_series, hist_frame, boxplot and boxplot_…
Jun 26, 2019
7e45996
Fixing plot accessor docstring (was in the wrong place, and couple of…
Jun 26, 2019
4fbfed0
Fixing hexbin plot tests
Jun 26, 2019
7d7263a
Fixing bug when calling plot twice on the same data, since the data (…
Jun 26, 2019
1a03cbf
Raising missing exception for pie in DataFrame, and fixing accessor s…
Jun 26, 2019
cf7cbc0
Fixing bug that shown the legend for Series plot
Jun 26, 2019
d063e05
Fix linting
Jun 26, 2019
c83551d
Merge remote-tracking branch 'upstream/master' into plot_api
Jun 28, 2019
369fbf1
Merge remote-tracking branch 'upstream/master' into plot_api
Jul 1, 2019
0d146f9
Fixing bug that made reusing the previous plot for dataframes
Jul 1, 2019
34ea1f2
Removing duplicated data type checks
Jul 1, 2019
19489ba
Restoring original position of methods, so the diff is smaller
Jul 1, 2019
9b4fc6d
Fixing name of reuse_plot parameter
Jul 1, 2019
2597bc9
Fixing bug with matplotlib 2
Jul 1, 2019
57c4937
Adding documentation and improving comments, based on Jeff review
Jul 2, 2019
263ee7a
Adding FutureWarning if Series.plot is called with positional arguments
Jul 2, 2019
37fe165
Not passing default matplotlib parameters to backends (all known kwar…
Jul 2, 2019
0cf4514
Fixing test of plotting accessor parameters
Jul 2, 2019
4d70d5d
Temporary not warning for Series.plot positional arguments (looks lik…
Jul 2, 2019
a2330b2
Revert "Temporary not warning for Series.plot positional arguments (l…
Jul 2, 2019
42a1b35
Merge remote-tracking branch 'upstream/master' into plot_api
Jul 2, 2019
34d189f
Adding debug info in the CI for failing test
Jul 2, 2019
5819585
Revert "Adding debug info in the CI for failing test"
Jul 2, 2019
29d7547
Temporary removing the warning, to see if it's causing the andrews_cu…
Jul 2, 2019
37fb064
Merge branch 'master' into PR_TOOL_MERGE_PR_27009
jreback Jul 3, 2019
a5d0fd9
Revert "Temporary removing the warning, to see if it's causing the an…
datapythonista Jul 3, 2019
ce544e1
Removing test that causes parallel_coordinates test to fail
datapythonista Jul 3, 2019
27 changes: 27 additions & 0 deletions doc/source/development/extending.rst
@@ -416,3 +416,30 @@ Below is an example to define two original properties, "internal_cache" as a tem
# properties defined in _metadata are retained
>>> df[['A', 'B']].added_property
property

.. _extending.plotting-backends:

Plotting backends
-----------------

Starting in 0.25, pandas can be extended with third-party plotting backends. The
main idea is to let users select a plotting backend other than the default one,
which is based on Matplotlib. For example:

.. code-block:: python

    >>> pd.set_option('plotting.backend', 'backend.module')
    >>> pd.Series([1, 2, 3]).plot()

This would be more or less equivalent to:

.. code-block:: python

    >>> import backend.module
    >>> backend.module.plot(pd.Series([1, 2, 3]))

The backend module can then use other visualization tools (Bokeh, Altair, ...)
to generate the plots.

More information on how to implement a third-party plotting backend can be found at
https://github.com/pandas-dev/pandas/blob/master/pandas/plotting/__init__.py#L1.
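
The equivalence described in the documentation above can be sketched without pandas. In this minimal illustration, `demo_backend` and `dispatch_plot` are hypothetical names, the backend module is built in memory rather than installed, and the lookup is a simplified stand-in for what pandas does internally, not its actual implementation.

```python
import importlib
import sys
import types

# Hypothetical backend module, created in memory so the example is
# self-contained. A real backend would be an installable package.
demo_backend = types.ModuleType("demo_backend")


def plot(data, kind, **kwargs):
    """Entry point a backend exposes, following the signature in this PR."""
    # A real backend would draw with its own library; here we echo the inputs.
    return {"kind": kind, "data": list(data), "options": kwargs}


demo_backend.plot = plot
sys.modules["demo_backend"] = demo_backend


def dispatch_plot(backend_name, data, kind="line", **kwargs):
    """Illustrative dispatcher (assumption): import the backend by name, call plot."""
    backend = importlib.import_module(backend_name)
    return backend.plot(data, kind=kind, **kwargs)


result = dispatch_plot("demo_backend", [1, 2, 3], kind="bar", title="demo")
```

Here `result` only echoes the arguments it received; a real backend would build and return a figure object from its own visualization library.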
2 changes: 1 addition & 1 deletion pandas/core/frame.py
@@ -7946,7 +7946,7 @@ def isin(self, values):

# ----------------------------------------------------------------------
# Add plotting methods to DataFrame
plot = CachedAccessor("plot", pandas.plotting.FramePlotMethods)
plot = CachedAccessor("plot", pandas.plotting.PlotAccessor)
hist = pandas.plotting.hist_frame
boxplot = pandas.plotting.boxplot_frame
sparse = CachedAccessor("sparse", SparseFrameAccessor)
2 changes: 1 addition & 1 deletion pandas/core/series.py
@@ -4518,7 +4518,7 @@ def to_period(self, freq=None, copy=True):
str = CachedAccessor("str", StringMethods)
dt = CachedAccessor("dt", CombinedDatetimelikeProperties)
cat = CachedAccessor("cat", CategoricalAccessor)
plot = CachedAccessor("plot", pandas.plotting.SeriesPlotMethods)
plot = CachedAccessor("plot", pandas.plotting.PlotAccessor)
sparse = CachedAccessor("sparse", SparseAccessor)

# ----------------------------------------------------------------------
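
The two one-line changes above register the same accessor class on both `DataFrame` and `Series`. The following is a rough, pandas-free sketch of that pattern. All class names here are hypothetical, and the descriptor is a simplified approximation of a cached accessor, not the pandas source.

```python
class CachedAccessorSketch:
    """Simplified descriptor: build the accessor on first access, then cache it."""

    def __init__(self, name, accessor):
        self._name = name
        self._accessor = accessor

    def __get__(self, obj, cls):
        if obj is None:
            # Accessed on the class itself: return the accessor class.
            return self._accessor
        accessor_obj = self._accessor(obj)
        # Store on the instance so later lookups bypass the descriptor.
        object.__setattr__(obj, self._name, accessor_obj)
        return accessor_obj


class PlotAccessorSketch:
    """A single accessor class shared by both containers (hypothetical)."""

    def __init__(self, data):
        self._parent = data

    def __call__(self, kind="line", **kwargs):
        # Report which container was plotted and with which kind.
        return (type(self._parent).__name__, kind)


class ToySeries:
    plot = CachedAccessorSketch("plot", PlotAccessorSketch)


class ToyFrame:
    plot = CachedAccessorSketch("plot", PlotAccessorSketch)


series, frame = ToySeries(), ToyFrame()
same_class = type(series.plot) is type(frame.plot)
cached = series.plot is series.plot
```

Because one class handles both containers, any container-specific behavior has to be decided inside the accessor by inspecting the parent object, instead of living in two separate subclasses.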
71 changes: 63 additions & 8 deletions pandas/plotting/__init__.py
@@ -1,18 +1,73 @@
"""
Plotting public API
Plotting public API.

Authors of third-party plotting backends should implement a module with a
public ``plot(data, kind, **kwargs)``. The parameter `data` will contain
the data structure and can be a `Series` or a `DataFrame`. For example,
for ``df.plot()`` the parameter `data` will contain the DataFrame `df`.
In some cases, the data structure is transformed before being sent to
the backend (see PlotAccessor.__call__ in pandas/plotting/_core.py for
the exact transformations).

The parameter `kind` will be one of:

- line
- bar
- barh
- box
- hist
- kde
- area
- pie
- scatter
- hexbin

See the pandas API reference for documentation on each kind of plot.

Any other keyword argument is currently assumed to be backend specific,
but some parameters may be unified and added to the signature in the
future (e.g. `title` which should be useful for any backend).

Currently, all the Matplotlib functions in pandas are accessed through
the selected backend. For example, `pandas.plotting.boxplot` (equivalent
[Review comment, Member] I don't think we should mention those right now, as there is still some discussion on whether to see them as part of the plotting backend interface or not.

[Reply, Member Author] Two lines below I say that "This is expected to change". Personally, I think it adds more value to have this comment now, and remove it when/if that changes, than to leave this undocumented or to document the future behavior.

to `DataFrame.boxplot`) is also accessed in the selected backend. This
is expected to change, and the exact API is under discussion. But with
the current version, backends are expected to implement the following functions:

- plot (described above, used for `Series.plot` and `DataFrame.plot`)
- hist_series and hist_frame (for `Series.hist` and `DataFrame.hist`)
- boxplot (`pandas.plotting.boxplot(df)` equivalent to `DataFrame.boxplot`)
- boxplot_frame and boxplot_frame_groupby
- tsplot (deprecated)
- register and deregister (register converters for the tick formats)
- Plots not called as `Series` and `DataFrame` methods:
  - table
  - andrews_curves
  - autocorrelation_plot
  - bootstrap_plot
  - lag_plot
  - parallel_coordinates
  - radviz
  - scatter_matrix

Use the code in pandas/plotting/_matplotlib and
https://github.com/pyviz/hvplot as a reference on how to write a backend.

For the discussion about the API see
https://github.com/pandas-dev/pandas/issues/26747.
"""
from pandas.plotting._core import (
FramePlotMethods, SeriesPlotMethods, boxplot, boxplot_frame,
boxplot_frame_groupby, hist_frame, hist_series)
PlotAccessor, boxplot, boxplot_frame, boxplot_frame_groupby, hist_frame,
hist_series)
from pandas.plotting._misc import (
andrews_curves, autocorrelation_plot, bootstrap_plot,
deregister as deregister_matplotlib_converters, lag_plot,
parallel_coordinates, plot_params, radviz,
register as register_matplotlib_converters, scatter_matrix, table)

__all__ = ['boxplot', 'boxplot_frame', 'boxplot_frame_groupby', 'hist_frame',
'hist_series', 'FramePlotMethods', 'SeriesPlotMethods',
'scatter_matrix', 'radviz', 'andrews_curves', 'bootstrap_plot',
'parallel_coordinates', 'lag_plot', 'autocorrelation_plot',
'table', 'plot_params', 'register_matplotlib_converters',
__all__ = ['PlotAccessor', 'boxplot', 'boxplot_frame', 'boxplot_frame_groupby',
'hist_frame', 'hist_series', 'scatter_matrix', 'radviz',
'andrews_curves', 'bootstrap_plot', 'parallel_coordinates',
'lag_plot', 'autocorrelation_plot', 'table', 'plot_params',
'register_matplotlib_converters',
'deregister_matplotlib_converters']
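
As a hedged illustration of the interface listed in the docstring above, a backend author could check a candidate module against the expected function names and validate `kind` against the ten documented values. The helper `missing_functions` and the module `partial_backend` are hypothetical; pandas does not provide such a check as part of this PR.

```python
import types

# Function names the docstring lists as currently expected from a backend.
EXPECTED_FUNCTIONS = (
    "plot", "hist_series", "hist_frame", "boxplot", "boxplot_frame",
    "boxplot_frame_groupby", "tsplot", "register", "deregister", "table",
    "andrews_curves", "autocorrelation_plot", "bootstrap_plot", "lag_plot",
    "parallel_coordinates", "radviz", "scatter_matrix",
)

# Values the docstring lists for the `kind` parameter.
PLOT_KINDS = (
    "line", "bar", "barh", "box", "hist", "kde", "area", "pie",
    "scatter", "hexbin",
)


def missing_functions(backend):
    """Return the expected names that the backend does not define as callables."""
    return [
        name for name in EXPECTED_FUNCTIONS
        if not callable(getattr(backend, name, None))
    ]


# Hypothetical backend that only implements `plot` so far.
partial_backend = types.ModuleType("partial_backend")


def plot(data, kind, **kwargs):
    if kind not in PLOT_KINDS:
        raise ValueError(f"unsupported plot kind: {kind!r}")
    return kind  # placeholder; a real backend would return a figure


partial_backend.plot = plot
missing = missing_functions(partial_backend)
```

Since the docstring notes that the exact API is still under discussion, a list like `EXPECTED_FUNCTIONS` should be treated as a snapshot of this PR and not as a stable contract.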