Interval (or series?) transform for line and area #597

Closed
mbostock opened this issue Nov 26, 2021 · 6 comments · Fixed by #792
Labels
enhancement New feature or request

Comments

@mbostock
Member

It’d be nice to support the interval transform for line and area marks such that if there’s missing data, a gap is automatically drawn rather than interpolating. I imagine that it would only be supported on lineX, lineY, areaX, and areaY, where it would apply to y, x, y, and x respectively.
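To illustrate the intended behavior outside of Plot, here is a minimal sketch in plain JavaScript. It assumes daily data sorted by date, and the helper name `withGaps` and the field names `date` and `value` are hypothetical. It inserts a NaN-valued point wherever two consecutive dates are more than one step apart, which is the kind of undefined value that would make a line break instead of interpolating.

```javascript
// Hypothetical helper (not part of Plot): insert a NaN-valued point wherever
// consecutive dates are more than one step apart, so a line would show a gap.
const DAY_MS = 24 * 60 * 60 * 1000; // one UTC day in milliseconds

function withGaps(data, step = DAY_MS) {
  const out = [];
  for (let i = 0; i < data.length; ++i) {
    const d = data[i];
    if (i > 0 && d.date - data[i - 1].date > step) {
      // Marker point one step after the previous date, with an undefined value.
      out.push({date: new Date(+data[i - 1].date + step), value: NaN});
    }
    out.push(d);
  }
  return out;
}

const sample = [
  {date: new Date("2022-01-01"), value: 3},
  {date: new Date("2022-01-02"), value: 5},
  {date: new Date("2022-01-05"), value: 2} // Jan 3 and Jan 4 are missing
];
const gapped = withGaps(sample);
```

A single marker per gap is enough for this purpose; it does not try to enumerate every missing day.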

@mbostock mbostock added the enhancement New feature or request label Nov 26, 2021
@mbostock mbostock changed the title Interval for line and area Interval transform for line and area Nov 26, 2021
@Fil
Contributor

Fil commented Dec 13, 2021

It would also be useful for textY: #611

@Fil
Contributor

Fil commented Dec 21, 2021

I love the idea, which seems simpler than the bin+empty bins filter that we currently have. The difficulty is that instead of a point being defined, we want to control when two consecutive points are connected, and create a multiline. Consecutive points are determined in the final call to shapeLine, but their original values are already scaled and can't be compared with the interval. (It's the first time that an issue comes from having applied the scales on the channels.) 🤔

@mbostock mbostock changed the title Interval transform for line and area Interval (or series?) transform for line and area Feb 26, 2022
@mbostock
Member Author

mbostock commented Feb 28, 2022

Here’s a “real world” example: https://observablehq.com/@mbostock/npm-daily-downloads?name=@observablehq/mtcars

This package is rarely downloaded, and I first assumed the npm endpoint doesn’t return an entry when there were zero downloads (oops, my own data processing was dropping zero-value entries). So you see this:

[image: untitled (17)]

But what you should see is this:

[image: untitled (18)]

Visual difference:

[image: untitled (19)]

The correct graph can be produced by changing the areaY definition from:

Plot.areaY(data, {x: "date", y: "value", fill: "steelblue", curve: "step-before"})

to use the bin transform with a null filter:

Plot.areaY(data, Plot.binX({y: "sum", filter: null}, {x: "date", interval: d3.utcDay, y: "value", fill: "steelblue", curve: "step-before"}))
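To make the effect of summing per day while keeping empty bins concrete, here is a rough sketch in plain JavaScript. It is an illustration of the idea only, not Plot's implementation; the helper name `sumByDay` and the `date`/`value` fields are assumptions. It sums values per UTC day and emits a zero for every day in the range that has no rows.

```javascript
// Hypothetical illustration (not Plot's code): sum values per UTC day and
// emit an explicit zero for each day in the range that has no data.
const MS_PER_DAY = 24 * 60 * 60 * 1000;

function sumByDay(data) {
  if (data.length === 0) return [];
  const floorDay = (t) => Math.floor(t / MS_PER_DAY) * MS_PER_DAY;
  const sums = new Map();
  let lo = Infinity;
  let hi = -Infinity;
  for (const {date, value} of data) {
    const t = floorDay(+date); // start of the UTC day containing this date
    sums.set(t, (sums.get(t) ?? 0) + value);
    if (t < lo) lo = t;
    if (t > hi) hi = t;
  }
  const out = [];
  for (let t = lo; t <= hi; t += MS_PER_DAY) {
    out.push({date: new Date(t), value: sums.get(t) ?? 0}); // empty day -> 0
  }
  return out;
}

const binned = sumByDay([
  {date: new Date("2022-01-01T10:00:00Z"), value: 2},
  {date: new Date("2022-01-01T15:00:00Z"), value: 3},
  {date: new Date("2022-01-04T00:00:00Z"), value: 1}
]);
```

With the zeros present, a step curve drops to the baseline on empty days rather than bridging across them.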

Which makes me think maybe we just need a convenience shorthand? Something like…

function seriesX(interval, options) {
  return binX({y: "sum", filter: null}, {...options, interval});
}

Then you’d say

Plot.areaY(data, Plot.seriesX(d3.utcDay, {x: "date", y: "value", fill: "steelblue", curve: "step-before"}))

Or maybe it’d be something supported by the mark, e.g.,

Plot.areaY(data, {x: "date", series: d3.utcDay, y: "value", fill: "steelblue", curve: "step-before"})

I think we probably don’t want to call this the interval option because that could have a different interpretation (as it does e.g. for the bar mark where x is transformed into x1 and x2).

@Fil
Contributor

Fil commented Feb 28, 2022

Yes, I think it makes sense as a mark option, since the question is "how do you interpolate over missing values", and that's only for grouped marks. The implementation could be different (in particular, we could keep the x values exact, imply an ascending order, and introduce intermediate points at (x0+p, 0) and (x1-p, 0) each time the difference between two consecutive x values (x0 and x1) is greater than or equal to 2 periods, or something like that?).
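The idea in the parenthetical above could be sketched roughly as follows. This is not the linked prototype; it assumes numeric x values and a numeric period p, and the helper name `padGapsWithZeros` is hypothetical. When the gap is exactly 2 periods, x0+p and x1-p coincide, so only one zero point is added.

```javascript
// Hypothetical sketch: keep the x values exact, sort ascending, and insert
// zero-valued points at x0 + p and x1 - p whenever x1 - x0 >= 2 * p.
function padGapsWithZeros(points, p) {
  const sorted = [...points].sort((a, b) => a.x - b.x);
  const out = [];
  for (let i = 0; i < sorted.length; ++i) {
    if (i > 0) {
      const x0 = sorted[i - 1].x;
      const x1 = sorted[i].x;
      if (x1 - x0 >= 2 * p) {
        out.push({x: x0 + p, y: 0});
        // Skip the second point when it would coincide with the first.
        if (x1 - p > x0 + p) out.push({x: x1 - p, y: 0});
      }
    }
    out.push(sorted[i]);
  }
  return out;
}

const padded = padGapsWithZeros(
  [{x: 0, y: 4}, {x: 1, y: 2}, {x: 2, y: 7}, {x: 6, y: 1}],
  1
);
```

For dates, the same logic would apply to timestamps in milliseconds, with p set to the period length; calendar-aware intervals (months, for instance) would need an interval's own offset logic instead of plain addition.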

@Fil
Contributor

Fil commented Feb 28, 2022

Here's a prototype: https://observablehq.com/@fil/npm-daily-downloads-597

@mbostock
Member Author

On the npm daily downloads topic, I realized it was me that was introducing the gaps, not the npm API. I was doing something like:

if (value > 0) {
  data.push({date: new Date(date), value});
}

I’ve changed this to instead truncate the series at the earliest non-zero value (without this, the chart would always start at 2015-01-01, rather than the date the package was first created).
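The notebook's actual code is not shown here, so the following is only a guess at what the truncation could look like: a hypothetical helper `trimLeadingZeros` that keeps every entry (including later zeros) from the first non-zero value onward, assuming the series is already sorted by date.

```javascript
// Hypothetical sketch: drop the leading run of zero-valued entries, but keep
// zeros that occur after the first non-zero value.
function trimLeadingZeros(series) {
  const start = series.findIndex((d) => d.value > 0);
  return start < 0 ? [] : series.slice(start);
}

const trimmed = trimLeadingZeros([
  {date: new Date("2015-01-01"), value: 0},
  {date: new Date("2015-01-02"), value: 0},
  {date: new Date("2015-01-03"), value: 4},
  {date: new Date("2015-01-04"), value: 0},
  {date: new Date("2015-01-05"), value: 1}
]);
```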
