Zero-dimensional grouping. #264

Merged: 5 commits merged into main from mbostock/group0 on Mar 22, 2021
mbostock
Member

Fixes #261.

This introduces a new groupZ transform that groups only on the first of {z, fill, stroke}; it ignores x and y. The default output channel is fill, but convenience aliases groupZX, groupZY, and groupZR are provided to output to those respective channels.

The x channel is now optional to the groupX transform; if missing, it groups only on the first of {z, fill, stroke}, like groupZY. Same for the y channel for the groupY transform, and the x and y channels for the group transform.

The groupX transform now never groups on y and the groupY transform now never groups on x; use the group transform if this is desired.

For example, for a one-dimensional stacked bar chart:

Plot.barX(penguins, Plot.stackX(Plot.groupZX({fill: "species", normalize: true})))
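
To make the zero-dimensional grouping concrete, here is a minimal plain-JavaScript sketch of what a transform like groupZX with normalize: true conceptually computes: one group per distinct fill value, with each count divided by the total. It does not call Plot; the groupByZ helper and the sample rows are hypothetical, and the real transform's internals may differ.

```javascript
// Hypothetical sample rows standing in for the penguins dataset.
const rows = [
  {species: "Adelie"}, {species: "Adelie"}, {species: "Gentoo"},
  {species: "Chinstrap"}, {species: "Adelie"}, {species: "Gentoo"},
  {species: "Adelie"}, {species: "Gentoo"}
];

// Group only on the z-like key (ignoring x and y) and count each group.
// With normalize set, counts become proportions of the total row count.
function groupByZ(data, key, {normalize = false} = {}) {
  const counts = new Map();
  for (const d of data) {
    counts.set(d[key], (counts.get(d[key]) || 0) + 1);
  }
  const total = normalize ? data.length : 1;
  return Array.from(counts, ([z, n]) => ({z, value: n / total}));
}

const shares = groupByZ(rows, "species", {normalize: true});
console.log(shares);
```

Stacking these proportions along x gives the single normalized bar that the one-line example describes.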

There’s still a lot of overlap between the group transform and the reduce transform, which behaves differently. I think we should try to unify the code, and I think the grouping behavior of the reduce transform should probably behave like the group transform in this PR (meaning there probably should be a variant of reduce that groups on x and y, and a variant that groups on neither).

@mbostock mbostock requested a review from Fil March 21, 2021 18:54
- return typeof value === "undefined" || (value && value.toString === objectToString) ? value : {value};
+ return value === undefined || (value &&
+   value.toString === objectToString &&
+   typeof value.transform !== "function") ? value : {value};
Member Author

This is needed to detect Plot.identity as the default value. Related #262.
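
As a rough illustration of why the extra transform check matters, the sketch below wraps the revised expression in a function and runs it on a few inputs. The name wrapValue and the identity stand-in are assumptions for illustration only; the source shows just the return statement, and Plot.identity itself may be defined differently.

```javascript
const objectToString = Object.prototype.toString;

// Hypothetical stand-in for Plot.identity: a plain object exposing a transform method.
const identity = {transform: (data) => data};

// Mirrors the revised check: a plain object is treated as an options object
// and returned as-is, unless it has a transform function, in which case it is
// treated as a channel value and wrapped as {value}.
function wrapValue(value) {
  return value === undefined || (value &&
    value.toString === objectToString &&
    typeof value.transform !== "function") ? value : {value};
}

const options = {value: "body_mass_g", normalize: true};
console.log(wrapValue(options) === options);         // options object passes through
console.log(wrapValue("species"));                   // a field name gets wrapped
console.log(wrapValue(identity).value === identity); // identity gets wrapped, not mistaken for options
```

Without the transform check, the identity object would satisfy the plain-object test and be returned unwrapped, as if it were an options object.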

@mbostock
Member Author

mbostock commented Mar 21, 2021

Like, I think if the group transform supported configurable outputs like the reduce transform currently does, rather than only being able to compute the frequency channel (L), then the group transform could subsume the reduce transform.

Edit: Although, reducing code duplication isn’t particularly urgent. I think the real urgency is that the API feels consistent from the outside.

@Fil
Contributor

Fil commented Mar 22, 2021

I can see how Plot.barY(data, Plot.groupX({x: "dim"})) is similar to:

Plot.barY(data, Plot.reduceX({ x: ([x]) => x, y: d => d.length  }, {x: "dim", y: "dim"}))

(But this can't be extended to normalize, I think?).
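
One way to picture the normalize concern, as a sketch in plain JavaScript rather than Plot's API: a per-group reducer sees only the rows of its own group, whereas a proportion also needs the total across all groups. The partition and reduceCounts helpers and the sample rows are hypothetical.

```javascript
// Hypothetical rows with a single dimension "dim".
const rows = [{dim: "a"}, {dim: "a"}, {dim: "b"}, {dim: "c"}, {dim: "a"}];

// Partition rows by a key, preserving first-seen order.
function partition(data, key) {
  const groups = new Map();
  for (const d of data) {
    if (!groups.has(d[key])) groups.set(d[key], []);
    groups.get(d[key]).push(d);
  }
  return groups;
}

// A per-group reducer only sees its own rows, so it can count them...
const count = (group) => group.length;

// ...but a proportion needs the total across all groups, computed in a second pass.
function reduceCounts(data, key, {normalize = false} = {}) {
  const reduced = Array.from(partition(data, key), ([x, group]) => ({x, y: count(group)}));
  if (!normalize) return reduced;
  const total = reduced.reduce((sum, d) => sum + d.y, 0);
  return reduced.map((d) => ({x: d.x, y: d.y / total}));
}

const counts = reduceCounts(rows, "dim");
const proportions = reduceCounts(rows, "dim", {normalize: true});
console.log(counts, proportions);
```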

Not sure if unifying means that groupX would become a shorthand for a more complete reduceX, or would share a similar API with {outputs}, {options}.

Sorry this is not super helpful, I'm beginning to get a bit lost with all the new names. I think I'll need an index card for transforms, like the one for marks.

@mbostock
Member Author

  • group - group on x, y, and {z, fill, stroke}, output to fill (by default)
  • groupR - group on x, y, and {z, fill, stroke}, output to r
  • groupX - group on x and {z, fill, stroke}, output to y (by default)
  • groupY - group on y and {z, fill, stroke}, output to x (by default)
  • groupZ - group on {z, fill, stroke}, output to fill (by default)
  • groupZX - group on {z, fill, stroke}, output to x
  • groupZY - group on {z, fill, stroke}, output to y
  • groupZR - group on {z, fill, stroke}, output to r
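
The list above can be restated as data, which may help as a quick reference. The sketch below is plain JavaScript, not Plot's implementation: the variants table copies the list, "z" is a simplification of "the first of {z, fill, stroke}", and applyVariant is a hypothetical helper that counts rows per key combination and writes the count to the default output channel.

```javascript
// Which channels form the group key, and which channel receives the count by default.
const variants = {
  group:   {keys: ["x", "y", "z"], output: "fill"},
  groupR:  {keys: ["x", "y", "z"], output: "r"},
  groupX:  {keys: ["x", "z"],      output: "y"},
  groupY:  {keys: ["y", "z"],      output: "x"},
  groupZ:  {keys: ["z"],           output: "fill"},
  groupZX: {keys: ["z"],           output: "x"},
  groupZY: {keys: ["z"],           output: "y"},
  groupZR: {keys: ["z"],           output: "r"}
};

// Hypothetical helper: count rows per distinct combination of the variant's keys.
function applyVariant(name, data) {
  const {keys, output} = variants[name];
  const groups = new Map();
  for (const d of data) {
    const id = JSON.stringify(keys.map((k) => d[k]));
    let g = groups.get(id);
    if (!g) {
      g = Object.fromEntries(keys.map((k) => [k, d[k]]));
      g[output] = 0;
      groups.set(id, g);
    }
    g[output] += 1;
  }
  return Array.from(groups.values());
}

// Hypothetical sample rows.
const data = [
  {x: "a", y: 1, z: "red"},
  {x: "a", y: 2, z: "red"},
  {x: "b", y: 1, z: "red"},
  {x: "b", y: 1, z: "blue"}
];

console.log(applyVariant("groupX", data));  // grouped on x and z, count in y
console.log(applyVariant("groupZY", data)); // grouped on z only, count in y
```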

@mbostock
Member Author

Okay, I’m going to land this so I can resume working on documentation, but let me know if you have thoughts.

@mbostock mbostock merged commit c6cf7d2 into main Mar 22, 2021
@mbostock mbostock deleted the mbostock/group0 branch March 22, 2021 21:13
@Fil
Contributor

Fil commented Mar 22, 2021

The current list is convenient, but I wonder whether this list would be more memorable:

  • groupXY - group on x, y and z, out: fill (by default)
  • groupX - group on x and z, out: y (by default)
  • groupY - group on y and z, out: x (by default)
  • group - group on z, out: fill (by default)

Any non-default output would have to use an explicit {out: "r"}. That may be less convenient, though.

@mbostock
Member Author

@Fil If we did that, then I think we’d need to rename marks e.g. Plot.cell ↦ Plot.cellXY, Plot.dot ↦ Plot.dotXY.

@Fil
Contributor

Fil commented Mar 23, 2021

Ah, true. Here's another suggestion, where the capital letter only denotes the out channel:

  • group - group on x, y, z, out: fill (by default)
  • groupX - group on x, y, z, out: x, y: identity (by default)
  • groupY - group on x, y, z, out: y, x: identity (by default)
  • groupR - group on z, out: r

This would mean switching groupX and groupY.

In the examples:

  • barY(…groupX) would become barY(…groupY)
  • Plot.barY(data, Plot.stackY(Plot.groupX({x: "species", fill: "island"}))) would become Plot.barY(data, Plot.stackY(Plot.groupY({x: "species", fill: "island"})))

If I'm not mistaken, this approach seems more consistent (the letter denotes only the out channel instead of in/out; we use X throughout for horizontal bars) and introduces fewer symbols. I've created a branch to test it: #266

Successfully merging this pull request may close these issues.

“Zero-dimensional” group and more flexible normalization…
2 participants