Cars MPG example plot #496

Merged
merged 7 commits into from
Aug 11, 2021

Contributor

@Fil Fil commented Aug 11, 2021

This example plot computes the median of cars' economy (mpg), grouped by number of cylinders.

Screenshot 2021-08-11 at 12 21 54

The bins are sorted by decreasing r, so that they are all visible (using Plot.sort, not #334 / #349!).

The example would benefit from stackR (#197).

It could also benefit from a strategy to create missing values for the line, so that it's broken when there are no data. However, it won't work with an approach such as "return empty bins" (#495), because returning empty bins will not create the z values for each and every category, which would be necessary if we wanted to create broken lines. This shows that a generic foolproof solution to #351 will require much more than #495 (and #489 and #491 are not better in that regard).
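To illustrate why every category needs its own placeholder values, here is a minimal sketch in plain JavaScript, without Plot. The sample points and the field names (`cylinders`, `year`, `mean`) are hypothetical, and using NaN as the gap marker assumes the line mark interrupts the line at invalid values.

```javascript
// Hypothetical aggregated points: one mean per (cylinders, year) that has data.
const aggregated = [
  {cylinders: 4, year: 70, mean: 25},
  {cylinders: 4, year: 71, mean: 28},
  {cylinders: 4, year: 72, mean: 24},
  {cylinders: 3, year: 70, mean: 19},
  {cylinders: 3, year: 72, mean: 21} // no 3-cylinder cars in year 71
];

// The full x-domain and the full set of series (the z values).
const allYears = [...new Set(aggregated.map((d) => d.year))].sort((a, b) => a - b);
const allSeries = [...new Set(aggregated.map((d) => d.cylinders))];
const lookup = new Map(aggregated.map((d) => [`${d.cylinders}|${d.year}`, d.mean]));

// Emit every (series, year) pair; pairs without data get NaN as a gap marker.
const filled = allSeries.flatMap((cylinders) =>
  allYears.map((year) => ({
    cylinders,
    year,
    mean: lookup.get(`${cylinders}|${year}`) ?? NaN
  }))
);

console.log(filled);
```

An empty bin for year 71 alone would not say which series it belongs to; the placeholder has to be created once per series, which is what the nested loop above does.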

It's also crying for a color legend (#23).

@Fil Fil requested a review from mbostock August 11, 2021 10:23
Member

@mbostock mbostock left a comment

Thanks for the test!

@mbostock
Member

If you switch x to a point scale here, you’ll notice something interesting: the Plot.binX transform for the line generates more bins than there are years:

Screen Shot 2021-08-11 at 7 57 18 AM

I think what’s happening here is that the bin transform generates half-year intervals [[70, 70.5], [70.5, 71], …], and Plot.binX then computes the midpoint of each interval [70.25, 70.75, …]. Since the years are integers, only half of the bins have any data, so the line is defined at [70.25, 71.25, 72.25, …] whereas the dots are drawn on integers [70, 71, 72, …].
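The arithmetic can be checked with a small sketch in plain JavaScript, without Plot. The sample years and the half-year step are assumptions chosen to match the intervals described here; this is not how Plot.binX picks its thresholds internally.

```javascript
// Hypothetical model years, stored as integers.
const years = [70, 71, 72, 73];

// Assumed bin width of half a year.
const step = 0.5;
const bins = [];
for (let x0 = 70; x0 < 74; x0 += step) {
  const x1 = x0 + step;
  bins.push({
    x0,
    x1,
    mid: (x0 + x1) / 2, // position of a mark drawn for this bin
    count: years.filter((y) => y >= x0 && y < x1).length
  });
}

// Only the bins that start on an integer contain any data.
const nonEmptyMidpoints = bins.filter((b) => b.count > 0).map((b) => b.mid);
console.log(bins.length); // 8 bins for 4 distinct years
console.log(nonEmptyMidpoints); // [70.25, 71.25, 72.25, 73.25]
```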

Therefore I think it’s more appropriate to use Plot.groupX for the line here, rather than Plot.binX:

Plot.line(data,
  Plot.groupX({y: "mean"}, {
    x: "year",
    y: "economy (mpg)",
    stroke: "cylinders",
    curve: "basis"
  })
)

Screen Shot 2021-08-11 at 8 02 55 AM

If you use Plot.groupX, it doesn’t make much of a difference whether you use a linear or point x-scale (since there are no gaps in the x-domain here).
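For comparison, grouping on the exact x value can be sketched in plain JavaScript as follows. The rows and the `mpg` field name are hypothetical, and the loop only mimics the idea of a per-group mean; it is not Plot.groupX itself.

```javascript
// Hypothetical rows standing in for the cars dataset.
const cars = [
  {year: 70, cylinders: 4, mpg: 24},
  {year: 70, cylinders: 4, mpg: 26},
  {year: 70, cylinders: 8, mpg: 15},
  {year: 71, cylinders: 4, mpg: 28},
  {year: 71, cylinders: 8, mpg: 13},
  {year: 71, cylinders: 8, mpg: 15}
];

// Group by (cylinders, year) and accumulate the sum and count of each group.
const groups = new Map();
for (const {year, cylinders, mpg} of cars) {
  const key = `${cylinders}|${year}`;
  const g = groups.get(key) ?? {cylinders, year, sum: 0, n: 0};
  g.sum += mpg;
  g.n += 1;
  groups.set(key, g);
}

// One point per group, positioned at the exact year rather than at a bin midpoint.
const points = [...groups.values()].map(({cylinders, year, sum, n}) => ({
  cylinders,
  year,
  mean: sum / n
}));

console.log(points);
```

Because each point keeps the original integer year, the line vertices coincide with the x positions of the dots.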

@mbostock
Member

Also, since the dots are overlapping here, I feel it’s desirable to minimize occlusion by using a stroke instead of a fill. And then you don’t need the sort and reverse transforms, so the code is simpler to boot. 🙂

Plot.dot(data, Plot.binY({r: "count"}, {
  x: "year",
  y: "economy (mpg)",
  stroke: "cylinders",
  thresholds: 20
}))

Screen Shot 2021-08-11 at 8 09 25 AM

@mbostock mbostock merged commit ca9d62b into main Aug 11, 2021
@mbostock mbostock deleted the fil/cars-mpg-example branch August 11, 2021 15:15
@reubano

reubano commented May 4, 2022

Can you please add this info on using Plot.sort to sort binned data on its z axis (the drawing order) to the README? It's mentioned, but I had no idea how to actually use it until finding this PR.

CR #439 #472

@Fil
Contributor Author

Fil commented May 7, 2022

The documentation mentions two things:

In the Plot.dot section:

Dots are drawn in input order, with the last data drawn on top. If sorting is needed, say to mitigate overplotting by drawing the smallest dots on top, consider a sort and reverse transform.

In the Plot.group section:

By default, all (non-empty) groups are generated in ascending natural order.

(please feel free to suggest a better wording)

Here is an example:

Plot.dot(
  { length: 10000 },
  Plot.group(
    { r: "count", sort: "count", reverse: false },
    {
      x: d3.randomBinomial(40, 0.5),
      y: d3.randomBinomial(30, 0.5),
      fill: "red",
      stroke: "white"
    }
  )
).plot()

untitled (21)

@reubano

reubano commented May 11, 2022

I saw it in one of the commits. The relevant code is here...

Plot.dot(data,
  Plot.reverse(
    Plot.sort(
      "length", 
      Plot.binY(
        { r: "count" }, 
        {
          x: "year",
          y: "economy (mpg)",
          fill: "cylinders",
          thresholds: 20
        }
      )
    )
  )
)
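
To make the resulting drawing order concrete, here is a minimal sketch in plain JavaScript, without Plot. It assumes that each bin reaching the sort step is an array of rows, so that sorting on "length" orders bins by their count; the sample bins are hypothetical.

```javascript
// Hypothetical bins: each one is the array of rows that fell into it.
const binned = [
  ["a"],
  ["b", "c", "d"],
  ["e", "f"]
];

const drawOrder = binned
  .slice()                             // leave the input untouched
  .sort((a, b) => a.length - b.length) // ascending by bin size
  .reverse();                          // largest first, smallest last

// Later items are drawn on top, so the smallest dots end up above the largest.
const sizes = drawOrder.map((bin) => bin.length);
console.log(sizes); // [3, 2, 1]
```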

@Fil
Contributor Author

Fil commented May 11, 2022

I've added that example to the notebook https://observablehq.com/@fil/plot-group-and-sort
