-
Notifications
You must be signed in to change notification settings - Fork 185
groupX becomes groupY; removes groupZX et al. #266
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
What about the bin transform? |
This makes it more difficult to group on one dimension when both x and y are defined: you have to pass e.g. I don’t think we have an example yet of this technique, but here’s a related one using the bin transform I posted in Slack: Plot.plot({
marginLeft: 100,
height: 640,
x: {
grid: true
},
color: {
scheme: "YlGnBu",
zero: true
},
marks: [
Plot.barX(athletes, Plot.binX({x: "weight", y: "sport", thresholds: 60, normalize: "z", out: "fill"}))
]
}) |
Yes the example above becomes a bit awkward Plot.plot({
marginLeft: 100,
height: 640,
x: {
grid: true
},
color: {
scheme: "YlGnBu",
zero: true
},
marks: [
Plot.barX(athletes, {
...Plot.bin({
x: "weight",
y: null,
z: "sport",
thresholds: 60,
normalize: "z"
}),
y: ([d]) => d["sport"]
})
]
}) (edited into Plot.bin) |
Hmm, I think it’s pretty confusing that binX now bins the y dimension… |
With this PR binX bins data and outputs x. This seems consistent with other symbols such as reduceX, barX, which all do something with y (and z), and express their result in the x dimension (what I've called "expansion" in https://observablehq.com/d/f6763557b9e3f19a ; group and bin are the two outliers in that respect, so it would also reduce confusion in that table). But it's true that the other X symbols consume x, so we could instead focus on inputs. But then, shouldn't group be named groupXY, and groupZ named group, and there would be no groupR or groupZX but an explicit {out: "r"} or {out: "x"}, and probably a few other changes? (I haven't followed this idea to the end.) I'm gently pushing this because the current notation looks quite asymetric, with X sometimes denoting input (groupX) and sometimes output (groupZX). (EDIT: re-reading https://observablehq.com/d/f6763557b9e3f19a#expansion I realize I already had suggested to switch groupX and groupY, though I'd forgotten that. Hello, february-me.) |
Right, my preference is that the name be associated with the channels that determine grouping, rather than the output channels (which are configurable). The groupZX, groupZY, groupZR, and groupR symbols are convenience helpers and we could drop them in favor of specifying the out option explicitly if that is desired. The competition for the “group” name between zero-dimensional groupZ and two-dimensional groupXY is an interesting observation. On the one hand I agree that “group” makes sense for the zero-dimensional case, but we already use “cell”, “dot”, “rect” etc. for the two-dimensional case for marks, so I don’t think it would be wrong to use “group” for two-dimensional here and “groupZ” for zero-dimensional. It is a little arbitrary, though. I don’t think we want to use “cellXY”, “dotXY”, “rectXY” etc. because that adds verbosity and those marks will never have a zero-dimensional equivalent. I think we still want groupZ, groupX, and groupY, rather than passing x: null and/or y: null to the group operator, so that it’s easier to control whether the x and y dimension gets aggregated or not (per the “thoughts on group” notebook). I do think the group transform probably needs to subsume the reduce transform by allowing arbitrary outputs. I want to think that through. |
Yes. With a detailed cheat sheet I'm now less confused, but I feel these symbols are redundant, and in a way will slow the discovery of the "out" option. |
Superseded by #272. |
cf. #264 (comment)