fix: 🐛 fix conversion of client_position into offset_at_position #567
Merged
Contributor
Author
|
WIP: The test needs to be improved, as it currently also passes for the previous implementation. Edit: No changes to the implementation turned out to be necessary; only the test case needed to be adapted. |
bb7138c to cc7408f
`offset_at_position` was incorrectly using code units instead of code points, leading to incorrect offset calculations for UTF-[8|16]. `client_num_units` returns code units, but the correct calculation needs to use code points, as returned by `len()`. This bug affects UTF-16 only in cases where code points exceeding the basic multilingual plane are present in lines preceding the given position. Adapt SAMPLE_STRING to cover this case. Introduce a test that checks all existent client positions' offset calculations in SAMPLE_STRING using an actually encoded string for comparison. Co-authored-by: Linus Heckemann <git@sphalerite.org>
cc7408f to 01cd614
Contributor
Author
|
The version I just uploaded is ready for review :) |
Collaborator
|
It looks good to me! I'd also like to get @lheckemann's opinion on this, seeing as they were working on it so recently. |
Contributor
|
@wolfskaempf and I worked on this together, and found this after finding that my change wasn't enough to get the language server to work with helix :) |
Collaborator
|
Oh! That's an approval then ✅🤓 |
tombh
approved these changes
Jul 16, 2025
Merged
Collaborator
|
Released in https://pypi.org/project/pygls/2.0.0a6 🚀 |
Description
`offset_at_position` was incorrectly using code units instead of code points, leading to incorrect offset calculations for UTF-[8|16]. `client_num_units` returns code units, but the correct calculation needs to use code points, as returned by `len()`.

This bug affects UTF-16 only in cases where code points exceeding the basic multilingual plane are present in lines preceding the given position.
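
To make the unit mismatch concrete, here is a minimal standalone sketch. It is not the pygls implementation: `utf16_units`, `units_to_code_points`, `offset_buggy` and `offset_fixed` are hypothetical helpers written for this illustration, and it assumes the client uses UTF-16 positions and that the returned offset is an index into a Python `str`.

```python
def utf16_units(text: str) -> int:
    """Number of UTF-16 code units needed to encode `text`."""
    return len(text.encode("utf-16-le")) // 2


def units_to_code_points(line: str, units: int) -> int:
    """Convert a UTF-16 column on a single line into a Python str index."""
    seen = 0
    for index, char in enumerate(line):
        if seen >= units:
            return index
        seen += utf16_units(char)
    return len(line)


def offset_buggy(lines, row, col_units):
    # Hypothetical reproduction of the bug: preceding lines are counted
    # in code units, although the result indexes a str (code points).
    preceding = sum(utf16_units(line) for line in lines[:row])
    return preceding + units_to_code_points(lines[row], col_units)


def offset_fixed(lines, row, col_units):
    # Preceding lines are counted in code points via len().
    preceding = sum(len(line) for line in lines[:row])
    return preceding + units_to_code_points(lines[row], col_units)


# U+1F600 lies outside the basic multilingual plane: one code point,
# but two UTF-16 code units.
lines = ["a\U0001F600b\n", "xyz\n"]
text = "".join(lines)

# Client position: line 1, character 1, which is the "y".
fixed = offset_fixed(lines, 1, 1)
buggy = offset_buggy(lines, 1, 1)

print(fixed, repr(text[fixed]))  # 5 'y'
print(buggy, repr(text[buggy]))  # 6 'z'
```

The two results differ only because the astral character sits on a line before the queried position, which matches the condition described above for UTF-16.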
Adapt `SAMPLE_STRING` to cover this case.

Introduce a test that checks all existent client positions' offset calculations in `SAMPLE_STRING` using an actually encoded string for comparison.
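
The idea of checking every position against an actually encoded string could look roughly like the sketch below. This is not the test added in this PR: `offset_at` is a hypothetical stand-in for the function under test, `sample` is an invented string rather than the project's `SAMPLE_STRING`, and `check_all_positions` is a name chosen for this illustration.

```python
def num_units(text: str, encoding: str, unit_size: int) -> int:
    """Number of code units `text` occupies in the given encoding."""
    return len(text.encode(encoding)) // unit_size


def offset_at(lines, row, col_units, encoding, unit_size):
    """Hypothetical stand-in for the function under test."""
    seen = 0
    col = len(lines[row])
    for index, char in enumerate(lines[row]):
        if seen >= col_units:
            col = index
            break
        seen += num_units(char, encoding, unit_size)
    return sum(len(line) for line in lines[:row]) + col


def check_all_positions(text, encoding, unit_size):
    """Check every code point boundary; return how many were checked."""
    lines = text.splitlines(keepends=True)
    checked = 0
    for row, line in enumerate(lines):
        for col in range(len(line)):
            # Client column, derived by really encoding the line prefix.
            col_units = num_units(line[:col], encoding, unit_size)
            offset = offset_at(lines, row, col_units, encoding, unit_size)

            # The offset must point at the same character ...
            assert text[offset] == line[col]
            # ... and the encoded prefix must have the expected length.
            expected = num_units("".join(lines[:row]), encoding, unit_size)
            expected += col_units
            assert num_units(text[:offset], encoding, unit_size) == expected
            checked += 1
    return checked


# Invented sample with astral characters on lines before later positions.
sample = "a\U0001F600b\nx\u00e9y\n\U0001F600z\n"

for encoding, unit_size in [("utf-8", 1), ("utf-16-le", 2), ("utf-32-le", 4)]:
    print(encoding, check_all_positions(sample, encoding, unit_size))
```

Deriving the expected values from `str.encode` keeps the check independent of the unit-counting logic being tested, so a test of this shape would not pass by accident for an implementation that mixes code units and code points.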
Code review checklist (for code reviewer to complete)
Automated linters
You can run the lints that are run on CI locally with: