BUG: Boxplot does not apply colors set by Matplotlib rcParams for certain plot elements #57709

Open
3 tasks done
thetestspecimen opened this issue Mar 2, 2024 · 2 comments
Labels: Bug, Needs Triage (issue that has not been reviewed by a pandas team member)

Comments

@thetestspecimen
Contributor

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl
df = pd.DataFrame(np.random.default_rng(12).random((10, 2)))
with mpl.rc_context({'boxplot.boxprops.color': 'red',
                     'boxplot.whiskerprops.color': 'green',
                     'boxplot.capprops.color': 'orange',
                     'boxplot.medianprops.color': 'cyan',
                     'patch.facecolor': 'grey'}):
    df.plot.box(patch_artist=True) # OR df.plot(kind='box', patch_artist=True)
    plt.show()

Issue Description

If the 'Reproducible Example' code is run it will result in the following:

[image: pandas-example]

If run directly through Matplotlib like so:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl
df = pd.DataFrame(np.random.default_rng(12).random((10, 2)))
with mpl.rc_context({'boxplot.boxprops.color': 'red',
                     'boxplot.whiskerprops.color': 'green',
                     'boxplot.capprops.color': 'orange',
                     'boxplot.medianprops.color': 'cyan',
                     'patch.facecolor': 'grey'}):
    plt.boxplot(df, patch_artist=True)
    plt.show()

You end up with this:

[image: matplotlib-example]

As you can see, Pandas completely ignores the rcParams assignments and sets its own colours.

In this example I have included only the elements (box, whiskers, caps, medians and box-face) whose rcParams are ignored. Note also that, because the rcParams are ignored, any Matplotlib stylesheet that sets these elements is ignored as well.

As Pandas does this only for these specific elements in the boxplot, it can result in some terrible-looking plots if someone uses a comprehensive set of rcParams (or a stylesheet) with a significantly different set of colours.
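
Since the screenshots only show the outcome visually, the applied colours can also be read back from the artists that Matplotlib returns. The following is a minimal sketch of such a check, not part of pandas or Matplotlib: the helper name `artist_colors` is hypothetical, and it assumes the headless `Agg` backend so that no window is opened. It runs the plain Matplotlib variant of the example and reports the colour of the first artist of each kind.

```python
import matplotlib as mpl

mpl.use("Agg")  # assumption: headless backend, so no window is needed

import matplotlib.colors as mcolors
import matplotlib.pyplot as plt
import numpy as np


def artist_colors(bp):
    """Summarise the colours of the first artist of each kind.

    `bp` is the dict returned by a boxplot call made with patch_artist=True,
    so the boxes are patches (edge and face colour) and the rest are lines.
    """
    box = bp["boxes"][0]
    return {
        "box_edge": mcolors.to_hex(box.get_edgecolor()),
        "box_face": mcolors.to_hex(box.get_facecolor()),
        "whisker": mcolors.to_hex(bp["whiskers"][0].get_color()),
        "cap": mcolors.to_hex(bp["caps"][0].get_color()),
        "median": mcolors.to_hex(bp["medians"][0].get_color()),
    }


# Same rcParams as in the reproducible example.
rc = {
    "boxplot.boxprops.color": "red",
    "boxplot.whiskerprops.color": "green",
    "boxplot.capprops.color": "orange",
    "boxplot.medianprops.color": "cyan",
    "patch.facecolor": "grey",
}

data = np.random.default_rng(12).random((10, 2))

with mpl.rc_context(rc):
    fig, ax = plt.subplots()
    applied = artist_colors(ax.boxplot(data, patch_artist=True))
plt.close(fig)

print(applied)
```

With plain Matplotlib, every entry should match the rcParams set above. The same helper should be usable on the pandas side by requesting the artist dictionary, for example with `df.plot.box(patch_artist=True, return_type="dict")`; per this report, the colours would then differ from the rcParams.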

A solution?

I have looked into where this occurs, and all the relevant code resides in:

https://github.com/pandas-dev/pandas/blob/main/pandas/plotting/_matplotlib/boxplot.py

Specifically, the methods _get_colors() and _color_attrs(). These two methods (along with some related code) pick specific colours from the assigned colormap and apply them to the plot.

I know what needs adjusting, and could put in a PR. However, due to the nature of rcParams being the "default" and hence having the lowest priority in terms of application, I see no way to adjust the code without changing the current default colours (i.e. blue, and a green median taken from the "tab10" colormap).

That is why I am filing this 'bug', as I can see this change might be objectionable, and as such will require further discussion on the appropriate solution. The solution I am proposing, of using matplotlib rcParam defaults, would result in the following "default" plot:

[image: matplotlib-default]

My personal opinion is that this visual change is minor, and therefore should be implemented. I would also argue that accessibility is hindered by the current implementation (colour blindness being an example).
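
The precedence discussed in this section (rcParams as the lowest-priority default, with any explicitly passed colour on top) can be sketched roughly as follows. This is only an illustration of the idea, not the code in boxplot.py or the contents of any pull request: the names `RC_KEYS` and `resolve_colors` are hypothetical, and a plain dict stands in for `matplotlib.rcParams` so that the snippet runs without Matplotlib installed. The element keys mirror those accepted by the `color` dict of `df.plot.box`.

```python
# Hypothetical lookup table: boxplot element -> Matplotlib rcParams key.
RC_KEYS = {
    "boxes": "boxplot.boxprops.color",
    "whiskers": "boxplot.whiskerprops.color",
    "caps": "boxplot.capprops.color",
    "medians": "boxplot.medianprops.color",
}


def resolve_colors(user_colors, rc_params):
    """Return one colour per element.

    An explicitly supplied colour wins; otherwise the value from
    rc_params (the lowest-priority default) is used.
    """
    user_colors = user_colors or {}
    return {
        element: user_colors.get(element, rc_params[rc_key])
        for element, rc_key in RC_KEYS.items()
    }


# Plain dict standing in for matplotlib.rcParams (values from the example).
rc = {
    "boxplot.boxprops.color": "red",
    "boxplot.whiskerprops.color": "green",
    "boxplot.capprops.color": "orange",
    "boxplot.medianprops.color": "cyan",
}

no_override = resolve_colors(None, rc)
with_override = resolve_colors({"medians": "purple"}, rc)

print(no_override)
print(with_override)
```

Under this scheme there is no separate pandas-specific default layer, which is why adopting it would change the current default colours, as noted above.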

Items to note

While reviewing the code I noticed the following:

  1. Issue #40769 ("BUG: Min/max markers on box plot are not visible with 'dark_background' theme") is not completely solved: it was fixed only for the method plot.box and not for boxplot (the two methods use different code within boxplot.py). See line 376 of boxplot.py, where the method boxplot hardcodes black for the caps: result = np.append(result, "k")
  2. The section of code refactored by #30346 ("color attribute of medianprops is not correctly understand in a boxplot") does not distinguish between edgecolor and facecolor when patch_artist is set to True. This may or may not have been intentional, but the two should probably be separated, as this is the only reason patch.facecolor features in this bug report.

Expected Behavior

If colours are set in matplotlib rcParams (or stylesheets) by the user, they should be applied to the plot, not ignored.

Installed Versions

commit : 69f03a3
python : 3.10.13.final.0
python-bits : 64
OS : Linux
OS-release : 6.7.4-2-MANJARO
Version : #1 SMP PREEMPT_DYNAMIC Sat Feb 10 09:41:20 UTC 2024
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : en_GB.UTF-8

pandas : 3.0.0.dev0+448.g69f03a39ec
numpy : 1.26.4
pytz : 2024.1
dateutil : 2.9.0
setuptools : 69.1.1
pip : 24.0
Cython : 3.0.8
pytest : 8.0.2
hypothesis : 6.98.15
sphinx : 7.2.6
blosc : None
feather : None
xlsxwriter : 3.1.9
lxml.etree : 5.1.0
html5lib : 1.1
pymysql : 1.4.6
psycopg2 : 2.9.9
jinja2 : 3.1.3
IPython : 8.22.1
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.3
bottleneck : 1.3.8
fastparquet : 2024.2.0
fsspec : 2024.2.0
gcsfs : 2024.2.0
matplotlib : 3.8.3
numba : 0.59.0
numexpr : 2.9.0
odfpy : None
openpyxl : 3.1.2
pyarrow : 15.0.0
pyreadstat : 1.2.6
python-calamine : None
pyxlsb : 1.0.10
s3fs : 2024.2.0
scipy : 1.12.0
sqlalchemy : 2.0.27
tables : 3.9.2
tabulate : 0.9.0
xarray : 2024.2.0
xlrd : 2.0.1
zstandard : 0.22.0
tzdata : 2024.1
qtpy : None
pyqt5 : None

@Axiself

Axiself commented Mar 7, 2024

I'm interested in taking the bug once it is triaged, if possible as part of a university project.

@thetestspecimen
Contributor Author

> I'm interested in taking the bug once it is triaged, if possible as part of a university project.

As I mentioned in the bug report, I have already looked into the problem and have a potential solution ready and waiting in the form of a pull request (pending discussion and approval with the Pandas devs of course).

In my opinion, it might be better for the Pandas project if you concentrated your efforts on a bug that is open but has no fix offered. There are currently 3.6k issues open, so you should be spoilt for choice.
