Fil/reduce empty bins to null #491

Fil · 2021-08-09T17:22:26Z

supersedes #489

Fil · 2021-08-09T17:40:50Z

(tested with #490 : 👍)

mbostock

I think if we want to return the empty bins, then we should incorporate the binfilter optimization here, and possibly we optimize rendering such that zero-width or zero-height rects and bars are not instantiated in SVG. We should not change the semantics of the reducers that should return zero for empty bins.

mbostock · 2021-08-09T19:04:01Z

src/transforms/group.js

@@ -199,13 +199,14 @@ const reduceLast = {
 const reduceCount = {
  label: "Frequency",
  reduce(I) {
-    return I.length;
+    return I.length || NaN;


This feels wrong. The count is zero for an empty bin, not NaN. Same with distinct and proportion. I agree that other reducers (such as first) should return undefined for an empty bin, but they already do, right?

Returning NaN is analogous to ignoring the empty bins and has the same result, except for line and area where it's semantically better (an empty bin results in a cut of the line rather than an interpolation between full bins). But I'll check if the check on emptiness can be done before the reducer, because that indeed feels wrong.

Note: the only other reducer which returned 0 was sum. I believe there's a tiny difference between a sum of empty and a sum of 1 and -1.

I understand that the effect is (roughly) the same on the resulting chart, but it changes the semantics of the bin transform, which I feel is undesirable.

mbostock · 2021-08-09T19:04:14Z

src/transforms/bin.js

-
-function nonempty2([, {size}]) {
-  return size > 0;
+  return [x0, x1, set.size ? I => I.filter(set.has, set) : () => []]; // TODO optimize


This seems like a nice optimization (if we want to return empty bins). 👍

creates channel values that will result in the corresponding mark to be filtered out, or skipped in case of a grouped mark

…lement is not passing the athletes-height-weight-bin-stroke.js needed a refinement if we wanted to ignore all the rects that we didn't want to show (instead of having invisible rects on the small non-empty bins)

(this also avoids looking up I[-1] (last) or I[0] with non-existent indices)

Fil · 2021-08-10T09:00:17Z

In terms of information, we want to be able to tell "there is no data" separately from "the result is zero", because in one case the result can be skipped by a line / ignored by a bar, whereas in the other it should be drawn even if it's a zero. Would you have the same gut feeling if the reducers returned null instead of NaN for "perhaps invalid bins"?

The code could work with nulls, even though it would mean a special case in the stack transform. (What to do with nulls when stacking? If it's bars that we're stacking, we can ignore nulls, but if if it's areas we probably want the ribbon to get thinner at that point—each mark would have to call the default stack transform with the relevant option to "ignore nulls" or "zero nulls").

The stack transform can then filter out invalid bars, which should be an option since we'll want it if calling from Plot.barY but not from Plot.areaY

Fil · 2021-08-10T18:13:36Z

TODO: this filtering of invalid/empty spots in Plot.stack should be an option: we'll want it if calling from Plot.barY but not from Plot.areaY.

mbostock · 2021-08-10T19:05:22Z

I see. I think what I’m objecting to is that it feels “icky” to encode (overload) the empty bit into the value returned by the reducer, and instead there should be some out-of-band way to communicate which bins are empty to things that care. I’ll think about this some more and try to come up with an alternative.

mbostock · 2021-08-11T00:42:14Z

I’ve mulled on this and I’d like to proceed with #495. I do think it makes sense that the bin transform filters by default (like the group transform, which does not return empty groups). And I feel an explicit filter option is more straightforward than giving the reducers special behavior for the empty case, and then likewise the stack transform needing special behavior for null and NaN. Thank you for engaging in this exploration with me and let’s chat about this tomorrow!

… by number of cylinders The bins are sorted by decreasing r, so that they are all visible. The example would benefit from stackR (#197). It could also benefit from a strategy to create missing values for the line, so that it's broken when there are no data. However, it won't work with an approach such as "return empty bins" (#495), because returning empty bins will not create the *z* values for each and every category, which would be necessary if we wanted to create broken lines. This shows that a generic foolproof solution to #351 will require much more than #495 (and #489 and #491 are not better in that regard).

* This example plot computes the median of cars' economy (mpg), grouped by number of cylinders The bins are sorted by decreasing r, so that they are all visible. The example would benefit from stackR (#197). It could also benefit from a strategy to create missing values for the line, so that it's broken when there are no data. However, it won't work with an approach such as "return empty bins" (#495), because returning empty bins will not create the *z* values for each and every category, which would be necessary if we wanted to create broken lines. This shows that a generic foolproof solution to #351 will require much more than #495 (and #489 and #491 are not better in that regard). * Update test/plots/cars-mpg.js Co-authored-by: Mike Bostock <[email protected]> * Update test/plots/cars-mpg.js Co-authored-by: Mike Bostock <[email protected]> * zero, not filter * group, not bin * remove console.log * stroke, not fill Co-authored-by: Mike Bostock <[email protected]>

Fil requested a review from mbostock August 9, 2021 17:41

Fil changed the title ~~Fil/reduce empty bins to null~~ Fil/reduce empty bins to NaN Aug 9, 2021

mbostock requested changes Aug 9, 2021

View reviewed changes

Fil added 4 commits August 9, 2021 21:41

reduce empty bins to null instead of 0

ad41598

creates channel values that will result in the corresponding mark to be filtered out, or skipped in case of a grouped mark

optimize the case of empty bins

fee5b9c

test for empty outside the reducers

50de216

(this also avoids looking up I[-1] (last) or I[0] with non-existent indices)

Fil force-pushed the fil/reduce-empty-bins-to-null branch from 7126482 to 50de216 Compare August 9, 2021 19:41

switch to null when reducing empty bins

aecdea7

The stack transform can then filter out invalid bars, which should be an option since we'll want it if calling from Plot.barY but not from Plot.areaY

Fil changed the title ~~Fil/reduce empty bins to NaN~~ Fil/reduce empty bins to null Aug 10, 2021

mbostock mentioned this pull request Aug 11, 2021

sort and filter for bin and group #495

Merged

mbostock closed this Aug 11, 2021

Fil mentioned this pull request Aug 11, 2021

Cars MPG example plot #496

Merged

Fil deleted the fil/reduce-empty-bins-to-null branch August 11, 2021 16:21

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fil/reduce empty bins to null #491

Fil/reduce empty bins to null #491

Fil commented Aug 9, 2021

Fil commented Aug 9, 2021

mbostock left a comment •

edited

Loading

mbostock Aug 9, 2021

Fil Aug 9, 2021

mbostock Aug 9, 2021

mbostock Aug 9, 2021

Fil commented Aug 10, 2021

Fil commented Aug 10, 2021

mbostock commented Aug 10, 2021 •

edited

Loading

mbostock commented Aug 11, 2021

Fil/reduce empty bins to null #491

Fil/reduce empty bins to null #491

Conversation

Fil commented Aug 9, 2021

Fil commented Aug 9, 2021

mbostock left a comment • edited Loading

Choose a reason for hiding this comment

mbostock Aug 9, 2021

Choose a reason for hiding this comment

Fil Aug 9, 2021

Choose a reason for hiding this comment

mbostock Aug 9, 2021

Choose a reason for hiding this comment

mbostock Aug 9, 2021

Choose a reason for hiding this comment

Fil commented Aug 10, 2021

Fil commented Aug 10, 2021

mbostock commented Aug 10, 2021 • edited Loading

mbostock commented Aug 11, 2021

mbostock left a comment •

edited

Loading

mbostock commented Aug 10, 2021 •

edited

Loading