Skip to content

explore a slightly more generic solution #6

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
michaelficarra opened this issue Apr 16, 2025 · 7 comments
Open

explore a slightly more generic solution #6

michaelficarra opened this issue Apr 16, 2025 · 7 comments

Comments

@michaelficarra
Copy link
Member

michaelficarra commented Apr 16, 2025

Instead of providing just a specialised function that would be used like

array.sort(String.comparingByCodePoints);

it would be similarly ergonomic and more generally applicable to do something like

array.sort(Iterator.comparingBy(s => s.codePoints()));

assuming we had (roughly)

String.prototype.codePoints = function() {
  return this[Symbol.iterator]().map(c => c.codePointAt(0)));
};

and also (roughly)

Iterator.comparingBy = getIter => {
  const NIL = {};
  return (a, b) => {
    let as = getIter(a);
    let bs = getIter(b);
    for (let [a, b] of Iterator.zip([as, bs], { mode: "longest", padding: [NIL, NIL] )) {
      if (a === NIL) return -1;
      if (b === NIL) return 1;
      if (a < b) return -1;
      if (b < a) return 1;
    }
    return 0;
  };
}
@ljharb
Copy link
Member

ljharb commented Apr 16, 2025

that's a really neat idea, and we should do that! (but i think we should also add a single comparator function)

@michaelficarra
Copy link
Member Author

Eh, I'm open to having both, but the difference in ergonomics here is really really small to me.

@waldemarhorwat
Copy link

I'd still want codePointCompare, which fixes the Unicode bug in the simplest and most efficient way.

Sorting is used a lot. An unanswered question here is how efficient the generalization would be compared to a hand-written codePointCompare which does not use iterators and can take advantage of fast paths for code units below 0xD800.

@bakkot
Copy link

bakkot commented Apr 16, 2025

Iterator.comparingBy(s => s.codePoints())

A comparingBy function which requires you to map each input to an iterable of values which have a natural order is not very general because not all domains can easily be mapped in an order-preserving way to a list of number/bigint/string. Your comparingBy would not be usable for sorting an array of arrays of strings with localeCompare, for example. Instead you want to take a comparator over the items in each iterable, that is, something more like

function compareLexicographic(itemComparator, getIter = x => x[Symbol.iterator]()) {
  const NIL = {};
  return function (as, bs) {
    for (let [a, b] of Iterator.zip([getIter(as), getIter(bs)], { mode: 'longest', padding: [NIL, NIL] })) {
      if (a === NIL) return -1;
      if (b === NIL) return 1;
      let itemResult = itemComparator(a, b);
      if (itemResult !== 0) return itemResult;
    }
    return 0;
  }
}

which you'd use here as array.sort(Iterator.compareLexicographic(Number.compare, s => s.codePoints())).

And at that point I don't think you can reasonably ask users to come up with that on their own.

@michaelficarra
Copy link
Member Author

@waldemarhorwat If there's a performance difference, I would find that more compelling than simply an ergonomic difference. But as with the object property counting proposal, I have a lot of faith in engines to do some clever pattern recognition.

@bakkot
Copy link

bakkot commented Apr 16, 2025

More faith in the engines than the engines have themselves, I think.

@kriskowal
Copy link
Member

kriskowal commented Apr 16, 2025

Long, long ago, I was very interested in a by(relation, compare?) function that elevated any relation to a comparator, but where the comparator also captured the relation and base-comparator as properties so that algorithms like sort could do a Schwartzian transform and avoid redundant pairwise comparison for expensive relations.

array.sort(by(({x, y}) => (x**2 + y**2)**0.5))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants