Modify replacement properties of encodeStringUtf8/decodeStringUtf8 #4928

Merged
2 commits merged into haskell:master on Dec 4, 2017

Conversation

hvr (Member) commented Dec 3, 2017

This should finally address #4644

Please include the following checklist in your PR:

  • Patches conform to the coding conventions.
  • Any changes that could be relevant to users have been recorded in the changelog.
  • The documentation has been updated, if necessary.

Please also briefly describe how you tested your change. Bonus points for added tests!

hvr added 2 commits December 3, 2017 22:31
This changes `decodeStringUtf8` to no longer replace U+FFFE and U+FFFF
with U+FFFD, while `encodeStringUtf8` now replaces surrogate code-points
(i.e. U+D800 through U+DFFF, which are invalid in UTF-8)
with U+FFFD.

Consequently, `decodeStringUtf8 . encodeStringUtf8` can now properly
round-trip all scalar code-points
(i.e. [U+0000..U+D7FF] ∪ [U+E000..U+10FFFF]).

This should finally address haskell#4644
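
To illustrate the encoder-side behaviour described in the commit message, here is a minimal, self-contained sketch using only `base`. It is not Cabal's actual implementation: `isSurrogate`, `replaceSurrogate`, and `encodeUtf8Sketch` are hypothetical names introduced for this example. It assumes only what the commit message states, namely that surrogate code-points are mapped to U+FFFD before encoding, while U+FFFE and U+FFFF are ordinary scalar code-points and are encoded unchanged.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- | True for surrogate code-points U+D800..U+DFFF, which UTF-8 cannot encode.
-- (Hypothetical helper, not part of Cabal's API.)
isSurrogate :: Char -> Bool
isSurrogate c = c >= '\xD800' && c <= '\xDFFF'

-- | Sanitising step assumed from the commit message:
-- surrogates become U+FFFD, every other code-point passes through.
replaceSurrogate :: Char -> Char
replaceSurrogate c
  | isSurrogate c = '\xFFFD'
  | otherwise     = c

-- | Illustrative UTF-8 encoder that applies 'replaceSurrogate' first.
encodeUtf8Sketch :: String -> [Word8]
encodeUtf8Sketch = concatMap (encodeChar . ord . replaceSurrogate)
  where
    encodeChar :: Int -> [Word8]
    encodeChar n
      | n < 0x80    = [fromIntegral n]                                -- 1 byte
      | n < 0x800   = [0xC0 .|. hi 6,  cont 0]                        -- 2 bytes
      | n < 0x10000 = [0xE0 .|. hi 12, cont 6,  cont 0]               -- 3 bytes
      | otherwise   = [0xF0 .|. hi 18, cont 12, cont 6, cont 0]       -- 4 bytes
      where
        -- Leading-byte payload: the bits above position s.
        hi :: Int -> Word8
        hi s = fromIntegral (n `shiftR` s)
        -- Continuation byte: 10xxxxxx carrying six bits starting at position s.
        cont :: Int -> Word8
        cont s = 0x80 .|. (fromIntegral (n `shiftR` s) .&. 0x3F)
```

In this sketch a lone surrogate such as `'\xD800'` comes out as the bytes of U+FFFD (`0xEF 0xBF 0xBD`), whereas `'\xFFFE'` and `'\xFFFF'` keep their own three-byte encodings, so a decoder that no longer rewrites them can return the original string.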
@23Skidoo 23Skidoo merged commit 6e1871a into haskell:master Dec 4, 2017
23Skidoo (Member) commented Dec 4, 2017

Merged, thanks!

@hvr hvr deleted the pr/issue-4644 branch December 4, 2017 08:45
@hvr hvr added this to the 2.2 milestone Dec 4, 2017