Skip to content

facetReindex #1057

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 17 commits into from
Closed

facetReindex #1057

wants to merge 17 commits into from

Conversation

Fil
Copy link
Contributor

@Fil Fil commented Sep 23, 2022

In some transforms we want to reindex the facets, because of overlapping indices. The typical case is the stack transform, where an element that is present in two facets could be stacked to a certain y position in facet 0, and a different y position in facet 1. To be able to store that information in a flat array structure, we need to "reindex", ie. give each of the data points a deduplicated index across facets. This means that, starting from n elements, we might generate a new y channel that has (up to) n*m elements, where m is the number of facets. The transform leaves us with channels of length n, and other channels of length between n+1 and n * m, which we need to reconcile.

Previously, I thought we'd need to "expand" the shorter channels, that is, copy all their values as many times as necessary to cover all the reindexed pointers. But this proved difficult as the transform doesn't know what those channels are. Also, memory hog.

However, it seems we can do the opposite: "shorten" the long channels and go back to the original facet index. (This allows to keep the short ones unchanged.) The catch is that we're only allowed to do this in places where we don't need the full range of values anymore.

In particular, when we reach the rendering stage, we're inside a facet, and from each "long" channel we can create a "shortened" channel to pass to mark.filter() + mark+render(). Short channels will be left unchanged.

To make this fully work, all intermediary transforms that read values will have to be careful about "reindexed facets". This is a TODO, but it should be possible to abstract this by changing any function that does i => X[i] into (pseudocode) (X is a short channel and facet.reindex) ? (i => X[facet.reindex(i)]) : (i => X[i]) .

Alternative to #1041

@Fil Fil force-pushed the fil/facet-reindex branch 2 times, most recently from a3075a7 to 78a5708 Compare September 29, 2022 09:27
@Fil
Copy link
Contributor Author

Fil commented Sep 29, 2022

All transforms are now compatible with facet reindexing & planning.

Supported transforms:

  • bin (accepts reindexed facets, returns new facets)
  • group (accepts reindexed facets, returns new facets)
  • dodge (accepts reindexed facets)
  • hexbin (accepts reindexed facets, returns new facets)
  • map (reindexes facets)
  • normalize (reindexes facets)
  • select (accepts reindexed facets, returns them with the plan)
  • stack (reindexes facets)
  • window (reindexes facets)
  • tree (accepts reindexed facets, returns new facets) — not tested

Not applicable:

  • basic
  • identity
  • inset
  • interval

@Fil Fil marked this pull request as ready for review September 29, 2022 15:56
@Fil Fil requested review from mbostock and duaneatat September 29, 2022 15:56
@Fil
Copy link
Contributor Author

Fil commented Sep 29, 2022

  • I took a small shortcut for bin/group/hexbin. Fix.

@Fil Fil force-pushed the fil/facet-reindex branch from e2a5d1f to aee4469 Compare September 30, 2022 14:29
@Fil Fil mentioned this pull request Sep 30, 2022
5 tasks
@Fil Fil changed the title facetReindex for stack facetReindex Sep 30, 2022
@Fil
Copy link
Contributor Author

Fil commented Oct 3, 2022

too much code! superseded by #1068.

@Fil Fil closed this Oct 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants