dodge (beeswarms) #648

mbostock · 2022-01-04T03:54:50Z

Inspired by @yurivish’s Building a Better Beeswarm and @jtrim-ons’s non-force-directed beeswarms, but using interval-tree-1d for simpler (and hopefully faster?) intersection testing.

Plot.plot({
  width: 764,
  height: 400,
  color: {
    scheme: "purd",
    range: [0.15, 0.95]
  },
  r: {
    type: "linear",
    range: [0, 20]
  },
  marks: [
    Plot.dot(dataset, Plot.dodgeY({x: "x", r: "r", fill: "r"}))
  ]
})

Plot.plot({
  width: 1152,
  height: 400,
  x: {
    domain: [-1, 0],
    grid: true,
    clamp: true
  },
  r: {
    range: [0, 24]
  },
  marks: [
    Plot.dot(mobility, Plot.dodgeY("middle", {
      filter: d => d.in_lockdown === "TRUE",
      sort: "pop",
      reverse: true,
      x: "pct_change",
      r: "pop",
      fill: "currentColor"
    }))
  ]
})

Demo: https://observablehq.com/d/2055ab012188f381

Still to figure out:

How to specify the Rsep (separation radius option)? Perhaps dodge: {anchor: "center", separation: 2}?
Would it be better to specify y: "dodge" or some such (i.e., tied to the channel), instead of dodge: true?
Avoid materializing the R array if r is a constant?
Generalize this to other marks? Does Plot need a concept of “layouts” (in pixel space), similar to transforms (data space)?

Fixes #164.

Fil

I love the feature, but I'd prefer if we could have a minimal code footprint in src/dot.js, since that's one of the first files that someone would read to understand how to write a mark. Also if we introduce layouts, we'll probably have more of them with time :)

I can see two options:

Make it a plugin, rather than a core feature (as in https://observablehq.com/@fil/experimental-plot-beeswarm, or better—and that is a good opportunity to define what a plugin is)
Move the specific code to a different file in src (layouts/dodge.js ?), but still call it from within dot.js

How to specify the Rsep (separation radius option)?

{dodge: "center", separation: 2} would be consistent with the "flat" options we have elsewhere (e.g. stack options), but see below ↓

Would it be better to specify y: "dodge" or some such (i.e., tied to the channel), instead of dodge: true?

maybe something like y: {layout: "dodge", separation: 2, …moreLayoutOptions} ?

mbostock · 2022-01-04T15:36:56Z

I think this is useful enough that I want it included by default, rather than folks having to go look for a plugin and learn how to load it.

I don’t think we should optimize the readability of the implementation. Or at least, it’s not a priority relative to the exposed functionality of Plot which will be used by many more people. Of course I will endeavor to build a beautiful implementation but I do not expect Plot’s implementation to be “easy” to understand, or simple (see the bin transform, for example).

I do think there will be other layouts, and that we may want this functionality shared across marks, and thus it’s useful to think about both how the code should be structured within Plot, and whether we can generalize how it is exposed, too. The idea of binding a layout to a channel is interesting. Though, I think it may be limiting, as I expect there will be other cases where a layout may produce more than one channel. So, perhaps there’s a “layout” option, similar to a transform? Like layout: Plot.dodge(…)?

jtrim-ons · 2022-01-04T21:21:25Z

This looks good to me, and thanks for the mention.

I hope this isn't just shamelessly promoting my own idea, but I wonder if it might be worth considering a "compact beeswarm" option if the radii are all equal. Some examples of compact beeswarm vs dodge are in my initial post here, and an explanation of the method is here. For many datasets, I prefer it to the dodge function, just because it doesn't have a windblown look.

I don't think any of my compact beeswarm implementations have particularly good code style, but I could potentially contribute to an implementation if it would be useful. I also completely understand if it's not something you want to add to Plot at the moment.

mbostock · 2022-01-04T22:12:46Z

src/marks/dot.js

+    // Find the best y-value where this circle can fit.
+    for (let y of intervals.flat().sort(compare)) {
+      if (intervals.every(([lo, hi]) => y <= lo || y >= hi)) {
+        Y[i] = y;


Currently we’re setting Y[i] = 0 if there’s no conflict, but I wonder at least in the case of asymmetric swarms whether we should instead be setting it to R[i] + Rsep so that the bottom of the swarm is aligned (rather than the centers of the circles).
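
For readers following the diff above, here is a minimal sketch of the placement step under discussion. It is not the PR's implementation: it swaps interval-tree-1d for a brute-force scan over previously placed circles, and the function name, the `padding` parameter (standing in for Rsep) and the array shapes are assumptions made for illustration.

```javascript
// Brute-force dodge sketch (assumption: X and R hold pixel positions and radii).
// The PR itself uses interval-tree-1d to find neighbors; this scans all earlier circles.
function dodgeY(X, R, padding = 1) {
  const Y = new Float64Array(X.length);
  for (let i = 0; i < X.length; ++i) {
    // Collect the y-intervals blocked by circles that were already placed.
    const intervals = [];
    for (let j = 0; j < i; ++j) {
      const dr = R[i] + R[j] + padding; // minimum allowed center distance
      const dx = X[i] - X[j];
      if (Math.abs(dx) >= dr) continue; // too far apart in x to ever collide
      const dy = Math.sqrt(dr * dr - dx * dx);
      intervals.push([Y[j] - dy, Y[j] + dy]);
    }
    // Candidates: the baseline (0), then every interval endpoint, nearest to the baseline first.
    const candidates = [0, ...intervals.flat()].sort((a, b) => Math.abs(a) - Math.abs(b));
    // The largest upper endpoint always satisfies the test, so find() cannot fail.
    Y[i] = candidates.find((y) => intervals.every(([lo, hi]) => y <= lo || y >= hi));
  }
  return Y;
}
```

In this sketch a circle with no conflict lands on y = 0, matching the current behavior described above. The alternative raised in the comment would presumably amount to using R[i] + padding as the baseline candidate for one-sided swarms.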

mbostock · 2022-01-07T00:35:04Z

Thanks @jtrim-ons! I think this initial algorithm is “good enough” to start, but I’m definitely willing to swap it out with something that performs better in the future. I’d prefer if the algorithm supported varying radius, though Plot could choose a special algorithm for fixed-radius dots if we wanted.

The main thing to figure out with this PR is how much we want to generalize the concept of layouts, or if this will be a bespoke feature for dots. (Or if it can start as bespoke and evolve into a more general approach in the future.)

Fil · 2022-01-14T22:03:09Z

Also related, Dorling cartograms (for Plot.Carto :-) )

mbostock · 2022-01-25T21:45:44Z

I’ve now implemented this as a layout independent of marks, as a counterproposal to #691. A layout takes the same arguments as a mark’s render function and returns transformed values instead of rendering, providing a hook for transforming channels before rendering. Layouts are invoked similarly to option transforms, and thus can access (and even transform!) mark options, or take separate layout-specific options as desired.
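
To make the shape of that hook concrete, here is a small sketch of what a layout and a composition helper could look like. Only the argument list (index, scales, values, dimensions) comes from this thread; the names `shiftY` and `composeLayouts`, and the assumption that values is an object of channel arrays keyed by name, are hypothetical.

```javascript
// Hypothetical layout factory: shifts the y channel by a fixed number of pixels.
function shiftY(dy) {
  return function layout(index, scales, values, dimensions) {
    const Y = values.y.slice(); // copy on write: leave the incoming channel untouched
    for (const i of index) Y[i] += dy;
    return {...values, y: Y}; // other channels are passed through unchanged
  };
}

// Hypothetical composition: the values returned by the first layout feed the second.
function composeLayouts(a, b) {
  return function layout(index, scales, values, dimensions) {
    const intermediate = a.call(this, index, scales, values, dimensions);
    return b.call(this, index, scales, intermediate, dimensions);
  };
}
```

Passing `this` through with `call` is one way a layout could read options from the mark it is attached to, assuming the layout is invoked as a method of the mark.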

Fil

It's really fun and powerful to see how we can add new layouts (I've created a circlePacking layout in less than 20 lines of code).

The dodge layout looks good to me.

As a follow-up or as part of this PR we want to extend it to work with a channel r, even if the underlying mark has no use for it, as this would enable labelling bubble charts with a variable radius. (As discussed in #708).

I'd also prefer if the default r for a symbol dot was consistent across the dot mark and the dodge layout (4.5 instead of 3), but not at the cost of overcomplicating things (also discussed in #708, with a possible solution of checking this.r).

I'll try and work on these two issues, which are not blocking.

Besides that, it needs documentation. A starting point is available in #708.

Fil · 2022-01-26T18:51:46Z

During review we mentioned a possible (structural) change, which would be to pass all the facets at once instead of calling the function once for each facet, thus allowing a layout to work "globally" or "locally" on faceted data.

mbostock · 2022-01-26T19:32:49Z

Another layout idea: a strategy for avoiding occlusion with text labels, like @fil/occlusion. Though I’m not sure how the layout would determine the bounding box for the text labels. Would it use a heuristic like we do for line wrapping? Would it use an offscreen canvas and measureText?

Fil · 2022-01-27T14:11:45Z

src/plot.js

    const index = filter(markIndex.get(mark), channels, values);
+    if (mark.layout != null) values = mark.layout(index, scales, values, dimensions);


Suggested change

-    if (mark.layout != null) values = mark.layout(index, scales, values, dimensions);
+    if (mark.layout != null) values = mark.layout(index.slice(), scales, values, dimensions);

should we make a defensive copy?

Alternatively, if we assume that it is legal for the layout to modify the index, write it as:

Suggested change

-    if (mark.layout != null) values = mark.layout(index, scales, values, dimensions);
+    if (mark.layout != null) ({index, values} = mark.layout(index, scales, values, dimensions));

(and same in the facet call).

I think my preference would go to explicitly allowing a layout to change the index, for example a hexbin layout could group points into hexagons, and a unit chart/isotype layout could create several symbols out of (say) a bar.
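
As an illustration of a layout that changes the index, here is a sketch of a unit-style layout that expands each datum into several symbols. The `unitLayout` name, the `count` channel and the plain-array channels are all assumptions for the example, not part of the PR.

```javascript
// Hypothetical "unit" layout: expands each datum into n symbols,
// where n is read from an assumed `count` channel.
function unitLayout(index, scales, values, dimensions) {
  const X = [], Y = [], newIndex = [];
  for (const i of index) {
    for (let k = 0; k < values.count[i]; ++k) {
      newIndex.push(X.length); // each generated symbol gets its own index entry
      X.push(values.x[i]);
      Y.push(k); // position within the stack; a real layout would convert this to pixels
    }
  }
  // Return only channels aligned with the new index.
  return {index: newIndex, values: {x: X, y: Y}};
}

// Caller side, mirroring the destructuring form of the suggested change.
let index = [0, 1];
let values = {x: [10, 20], count: [2, 3]};
({index, values} = unitLayout(index, {}, values, {}));
```

After the call, index and values describe five symbols rather than two data, which is why the caller has to accept a new index instead of reusing its own.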

should we make a defensive copy?

I feel like we’ve discussed this a bunch, so I want to reiterate what our conventions are.

We favor immutability. We typically use “copy on write” (e.g., returning a new object or array) rather than mutating the value in-place. This is especially true regarding arguments to a function, where the caller typically would not expect the passed values to be mutated as a side effect. In cases where we violate this principle, we should call it out (e.g., axes.js).

We aren’t defensive. If we invoke a user function, e.g. a custom reducer, we do not create a defensive copy to protect against mutation. Instead we assume that the user function will not modify the input. If user code violates this assumption, the behavior is undefined. “All bets are off.” We do this because creating a defensive copy adds significant overhead and would effectively penalize “well-behaved” users because of hypothetical bad behavior.

We create copies when transferring ownership. (This could be considered an exception or nuance to the previous rule.) For functions that return values, as opposed to function arguments, we consider the returned values to be owned by the user. The user should be allowed to mutate the values if they desire. This means we need to return a copy. For example this is used in plot.scale when we return a domain.

Mutation is also acceptable if it’s not “visible externally” as a performance optimization, though this should still be used with caution.
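
The three conventions can be sketched in a few lines. The helper names below (`reversed`, `reduceWith`, `makeScale`) are hypothetical and only illustrate the rules; they are not Plot functions.

```javascript
// Copy on write: return a new array instead of mutating the argument.
function reversed(values) {
  return values.slice().reverse();
}

// Not defensive: the user's reducer receives the internal array directly.
// If it mutates that array, the behavior is undefined.
function reduceWith(reducer, values) {
  return reducer(values);
}

// Copy on transfer of ownership: the returned domain belongs to the caller,
// so hand out a copy rather than the internal array.
function makeScale(domain) {
  const internal = Array.from(domain);
  return {
    get domain() {
      return internal.slice();
    }
  };
}
```

With this shape, a caller can sort or overwrite the array it gets from `domain` without affecting later reads, while a well-behaved reducer pays no copying cost.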

Fil · 2022-01-27T15:13:54Z

Suggestions:

Dodge: use the mark's this.r if present #711
~~call the layout once with all the frame indices #712~~
document layouts #713 ~~(needs a small update if we approve 712)~~

Besides these suggestions:

the question of whether a layout is explicitly allowed or forbidden to modify the index (dodge (beeswarms) #648 (comment)); this is blocking if we expose the API (since it would mean a change to the API), but not blocking if we don't yet expose it. NO CHANGE
the need for Plot.text(dodgeY({r: channel})) — TBD, but probably not blocking

mbostock · 2022-01-27T18:54:12Z

I think my preference would go to explicitly allowing a layout to change the index, for example a hexbin layout could group points into hexagons, and a unit chart/isotype layout could create several symbols out of (say) a bar.

I think we should try implementing these and see where that takes us.

For hexbin, I’m guessing that’s more of a mark than a layout (or transform)? Because if you’re producing hexagonal bins, you almost certainly want to render them as hexagons, meaning that the representation is dictated and it’s not simply transforming channel values. I’ve been thinking about density contours similarly, which perhaps we could also implement.

For isotype and waffle charts (is that what you meant by unit chart?), it feels like it might be more of a transform than a layout. Or maybe that’s a mark, too? Or a pattern? In any case it should be easier to answer these questions through prototyping. 😁

mbostock · 2022-03-01T21:45:44Z

Superseded by #775.

mbostock requested a review from Fil January 4, 2022 03:54

Fil reviewed Jan 4, 2022

mbostock commented Jan 4, 2022

mbostock mentioned this pull request Jan 16, 2022

universal channel filter #671

Merged

mbostock force-pushed the mbostock/dot-dodge branch 4 times, most recently from 6f8d9cb to 08f0e0b Compare January 25, 2022 21:39

mbostock requested a review from Fil January 25, 2022 21:42

mbostock added 2 commits January 25, 2022 18:33

dodge (beeswarm)

8c5fb6f

compose layouts

7b9d931

mbostock force-pushed the mbostock/dot-dodge branch from b400163 to 7b9d931 Compare January 26, 2022 02:33

mbostock added 3 commits January 25, 2022 18:41

coerce padding

fb5d000

tweak logic

2effd27

facet layout

f3cf861

Fil reviewed Jan 26, 2022

propagate this

7a845f0

Fil mentioned this pull request Jan 26, 2022

first tentative for a layouts option #691

Closed

1 task

Fil reviewed Jan 27, 2022

Fil mentioned this pull request Jan 27, 2022

call the layout once with all the frame indices #712

Closed

Fil mentioned this pull request Feb 23, 2022

layouts: dodge, hexbin #775

Closed

mbostock closed this Mar 1, 2022

mbostock mentioned this pull request Mar 11, 2022

mark initializers #801

Merged

16 tasks
