[Performance] Make determining whether a code point represents a combining mark faster #1719
The call `unicode.In(r, unicode.Mark)` introduced in 5c8a233 is a performance hog. Profiling reveals that a double-digit percentage of Micro's total CPU time is spent inside that function. The reason is that the range table `unicode.Mark` has more than 100 entries, including many highly exotic code points not commonly thought of as "marks".

This PR essentially reverts 5c8a233, except that instead of constructing a range table, it uses a hand-written function that only performs the necessary comparisons. This avoids a bunch of branching and iteration logic in `unicode.In`, which makes it even faster than the previous approach.

The benchmarks demonstrate that this is indeed a huge performance improvement. (I had to raise the p-value threshold from 0.05 to 0.15 for benchstat to show this, as three benchmark runs are unfortunately not sufficient to reach p-values that low in most situations.)
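For illustration, a hand-written check of this kind might look as follows. This is only a sketch of the approach, not the exact function from the PR; the particular combining blocks covered here are an assumption.

```go
package main

import "fmt"

// isMark reports whether r falls into one of the common combining-mark
// blocks. Compared to unicode.In(r, unicode.Mark), this trades full
// Unicode coverage for a handful of direct range comparisons.
func isMark(r rune) bool {
	return (r >= 0x0300 && r <= 0x036F) || // Combining Diacritical Marks
		(r >= 0x1AB0 && r <= 0x1AFF) || // Combining Diacritical Marks Extended
		(r >= 0x1DC0 && r <= 0x1DFF) || // Combining Diacritical Marks Supplement
		(r >= 0x20D0 && r <= 0x20FF) || // Combining Diacritical Marks for Symbols
		(r >= 0xFE20 && r <= 0xFE2F) // Combining Half Marks
}

func main() {
	fmt.Println(isMark('\u0301')) // combining acute accent: true
	fmt.Println(isMark('a'))      // plain letter: false
}
```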
Of course, this code is not perfectly equivalent to using `unicode.Mark`, but I don't think getting every single corner case of every exotic script right is worth slowing Micro down by 20%.

It would be really nice if such performance regressions could be caught automatically by CI, but AFAIK Travis doesn't provide a consistent runtime environment. Do you know a better option? How are the nightly builds generated?
Also, why are the Unicode helper functions from the `highlight` package duplicated? Micro depends on that package, so why are they not simply imported?