Cars MPG example plot #496
Thanks for the test!
Co-authored-by: Mike Bostock <[email protected]>
If you switch x to a point scale here, you’ll notice something interesting: the Plot.binX transform for the line is generating more bins than there are years. I think what’s happening here is that the bin transform is generating half-year intervals [[70, 70.5], [70.5, 71], …] and then Plot.binX computes the midpoint of each interval [70.25, 70.75, …], but only half of the bins have any data, so the line is defined at [70.25, 71.25, 72.25, …] whereas the dots are drawn on integers [70, 71, 72, …]. Therefore I think it’s more appropriate to use Plot.groupX for the line here, rather than Plot.binX:

Plot.line(data,
  Plot.groupX({y: "mean"}, {
    x: "year",
    y: "economy (mpg)",
    stroke: "cylinders",
    curve: "basis"
  })
)

If you use Plot.groupX, it doesn’t make much of a difference whether you use a linear or point x-scale (since there are no gaps in the x-domain here).
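The offset described above can be reproduced without Plot. The following is a minimal sketch in plain JavaScript, not Plot's actual implementation: `binMidpoints` and `groupKeys` are hypothetical helpers, the sample years are made up, and the half-open half-year bins are an assumption taken from the intervals quoted in the comment.

```javascript
// Hypothetical sample: integer model years, as in the cars dataset.
const sampleYears = [70, 70, 71, 72, 72, 73];

// Assumed binning: half-open intervals of width `step`, e.g. [70, 70.5), [70.5, 71), …
function binMidpoints(values, step) {
  const counts = new Map();
  for (const v of values) {
    const x1 = Math.floor(v / step) * step; // lower edge of the bin containing v
    counts.set(x1, (counts.get(x1) ?? 0) + 1);
  }
  // Only non-empty bins produce a point, placed at the interval midpoint.
  return [...counts.keys()].sort((a, b) => a - b).map((x1) => x1 + step / 2);
}

// Grouping keeps the distinct values themselves as x positions.
function groupKeys(values) {
  return [...new Set(values)].sort((a, b) => a - b);
}

const binned = binMidpoints(sampleYears, 0.5);
const grouped = groupKeys(sampleYears);
console.log(binned);  // [70.25, 71.25, 72.25, 73.25]
console.log(grouped); // [70, 71, 72, 73]
```

With integer years, every second half-year bin is empty, so the binned x positions sit a quarter of a year to the right of the grouped ones.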
Also, since the dots are overlapping here, I feel it’s desirable to minimize occlusion by using a stroke instead of a fill. And then you don’t need the sort and reverse transforms, so the code is simpler to boot. 🙂

Plot.dot(data, Plot.binY({r: "count"}, {
  x: "year",
  y: "economy (mpg)",
  stroke: "cylinders",
  thresholds: 20
}))
The documentation mentions two things: In the Plot.dot section:
In the Plot.group section
(please feel free to suggest a better wording) Here is an example:
I saw it in one of the commits. The relevant code is here...

Plot.dot(data,
  Plot.reverse(
    Plot.sort(
      "length",
      Plot.binY(
        { r: "count" },
        {
          x: "year",
          y: "economy (mpg)",
          fill: "cylinders",
          thresholds: 20
        }
      )
    )
  )
)
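The effect of sorting by bin length and then reversing can be sketched in plain JavaScript. This is only an illustration of the drawing order, not Plot's code; the `sampleBins` records and their `count` field are made up for the example.

```javascript
// Hypothetical bins: each has a position and the number of cars it contains.
const sampleBins = [
  {x: 70, y: 25, count: 3},
  {x: 70, y: 15, count: 12},
  {x: 71, y: 20, count: 7}
];

// Sort ascending by count, then reverse: the largest dots come first.
// Marks drawn later are painted on top, so the smaller dots stay visible.
const drawOrder = sampleBins
  .slice() // copy, so the input array is left untouched
  .sort((a, b) => a.count - b.count)
  .reverse();

console.log(drawOrder.map((d) => d.count)); // [12, 7, 3]
```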
I've added that example to the notebook https://observablehq.com/@fil/plot-group-and-sort
This example plot computes the median of cars' economy (mpg), grouped by number of cylinders.
The bins are sorted by decreasing r, so that they are all visible (using Plot.sort, not #334 / #349!).
The example would benefit from stackR (#197).
It could also benefit from a strategy to create missing values for the line, so that it's broken when there are no data. However, it won't work with an approach such as "return empty bins" (#495), because returning empty bins will not create the z values for each and every category, which would be necessary if we wanted to create broken lines. This shows that a generic foolproof solution to #351 will require much more than #495 (and #489 and #491 are not better in that regard).
It's also crying for a color legend (#23).
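To make the broken-line point concrete, here is a minimal sketch in plain JavaScript of what "creating missing values" would have to do: emit one row for every (year, z) combination, with NaN where a category has no data that year. The helper `meanByYearAndZ`, the shortened field names, and the sample rows are all hypothetical, and a mean stands in for whichever reducer is used. Whether a NaN actually interrupts the line depends on the renderer.

```javascript
// Hypothetical rows; 8-cylinder cars have no data for year 71.
const sampleRows = [
  {year: 70, cylinders: 4, mpg: 25},
  {year: 70, cylinders: 8, mpg: 14},
  {year: 71, cylinders: 4, mpg: 27},
  {year: 72, cylinders: 4, mpg: 23},
  {year: 72, cylinders: 8, mpg: 13},
  {year: 72, cylinders: 8, mpg: 15}
];

function meanByYearAndZ(rows) {
  const years = [...new Set(rows.map((d) => d.year))].sort((a, b) => a - b);
  const zs = [...new Set(rows.map((d) => d.cylinders))].sort((a, b) => a - b);
  const out = [];
  // Walk the full cross product so that every category gets a row for every year.
  for (const z of zs) {
    for (const year of years) {
      const values = rows
        .filter((d) => d.cylinders === z && d.year === year)
        .map((d) => d.mpg);
      // NaN marks a gap; an empty bin without a z value could not carry this information.
      const mean = values.length
        ? values.reduce((a, b) => a + b, 0) / values.length
        : NaN;
      out.push({year, cylinders: z, mpg: mean});
    }
  }
  return out;
}

const filled = meanByYearAndZ(sampleRows);
console.log(filled.length); // 6 rows: 2 cylinder classes × 3 years
```

An empty bin that is only keyed by x has no cylinders value, so it cannot be assigned to a series; the gap has to be generated per category, as in the inner loop here.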