DOC: clarify boxplot whiskers extension #34070

Merged
merged 3 commits into from
May 21, 2020

neutrinoceros
Contributor

As I'm getting started with box plots, I found it confusing that the DataFrame.boxplot docstring didn't mention that whiskers extend at most 1.5 * IQR away from the box's edges.

  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry -> is it required for such a minor change?

@neutrinoceros neutrinoceros changed the title doc: clarify boxplot whisker's extension DOC: clarify boxplot whisker's extension May 8, 2020
@neutrinoceros neutrinoceros changed the title DOC: clarify boxplot whisker's extension DOC: clarify boxplot whiskers extension May 8, 2020
Member

@MarcoGorelli MarcoGorelli left a comment

Thanks @neutrinoceros, agreed that this clarifies things.

Co-authored-by: Marco Gorelli <[email protected]>
@joooeey
Contributor

joooeey commented May 11, 2020

This doc change is sorely needed. Thanks for doing it! However, your explanation is wrong. The whiskers reach exactly 1.5 * IQR from the box only if there happens to be a data point exactly there. Here's how it actually works: https://stackoverflow.com/a/49781675/4691830

I suggest the following wording:

By default, they extend no more than `1.5 * IQR (IQR = Q3 - Q1)` from the edges of the box, ending at the farthest data point within that interval. Outliers are plotted as separate dots.
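
To make the wording above concrete, here is a minimal plain-Python sketch of that rule. It is not the pandas or matplotlib implementation: the helper names `percentile` and `whisker_ends` are hypothetical, and the quartiles are assumed to use linear interpolation between sorted data points.

```python
def percentile(sorted_values, q):
    """Percentile (q in [0, 100]) of pre-sorted data, by linear interpolation."""
    position = (len(sorted_values) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def whisker_ends(values, whis=1.5):
    """Return (low whisker, high whisker, outliers) for one box (illustrative only)."""
    data = sorted(values)
    q1 = percentile(data, 25)
    q3 = percentile(data, 75)
    iqr = q3 - q1
    low_fence = q1 - whis * iqr    # farthest the lower whisker may reach
    high_fence = q3 + whis * iqr   # farthest the upper whisker may reach
    # Each whisker ends at the farthest data point inside its fence.
    # If no point qualifies, fall back to the box edge (an assumption of this sketch).
    low = min((x for x in data if low_fence <= x <= q1), default=q1)
    high = max((x for x in data if q3 <= x <= high_fence), default=q3)
    outliers = [x for x in data if x < low_fence or x > high_fence]
    return low, high, outliers


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
print(whisker_ends(sample))  # (1, 9, [30])
```

For this sample, Q1 = 3.25 and Q3 = 7.75, so the upper fence sits at 14.5, yet the upper whisker stops at 9 because that is the farthest data point inside the fence. The value 30 lies beyond the fence and is reported as an outlier.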

@neutrinoceros
Contributor Author

Thanks @joooeey for clarifying this, I applied your suggestion!

@MarcoGorelli
Member

> However, your explanation is wrong

@joooeey are you sure it's wrong? When the OP wrote

> they extend up to 1.5 * IQR (IQR = Q3 - Q1) from the edges of the box, or to the farthest
> data point if it lies within this interval

I think they meant that the whiskers only extend there if there's a point there, which I think is what you're saying.

> I suggest the following wording:

Having said that, thanks for this, I do think your wording is very clear

@MarcoGorelli
Member

@neutrinoceros did you push it? :) I don't (yet) see anything here

@neutrinoceros
Contributor Author

neutrinoceros commented May 11, 2020

> I think they meant that the whiskers only extend there if there's a point there, which I think is what you're saying.

I believe my own wording was correct if you read the "or" and the "within that interval" as inclusive, which shows my own misunderstanding.

> @neutrinoceros did you push it? :) I don't (yet) see anything here

Oops, you're right, the push failed for some reason.

@WillAyd WillAyd added the Docs label May 21, 2020
@WillAyd WillAyd added this to the 1.1 milestone May 21, 2020
@WillAyd WillAyd merged commit 3fd150c into pandas-dev:master May 21, 2020
@WillAyd
Member

WillAyd commented May 21, 2020

Great, thanks @neutrinoceros

PuneethaPai pushed a commit to PuneethaPai/pandas that referenced this pull request May 22, 2020