Unit test Distribution.Utils.ShortText BinaryId fails #4644
Wikipedia says
Whoa.
I need to look into why a BOM (which btw makes no sense whatsoever for UTF-8 encodings) doesn't round-trip properly. IIRC I specifically tested such corner cases in the implementation of http://hackage.haskell.org/package/text-short PS: I just noticed this is with the GHC 7.6.3 configuration, so this may be a problem with the legacy fallback...
After some investigation, the issue is in fact for the
because
I'll take a stab at harmonizing the
This changes `decodeStringUtf8` so that it no longer replaces U+FFFE and U+FFFF with U+FFFD, while `encodeStringUtf8` now replaces surrogates (i.e. code points U+D800 through U+DFFF, which are invalid in UTF-8) with U+FFFD. Consequently, `decodeStringUtf8 . encodeStringUtf8` can now properly round-trip all scalar values (i.e. [U+0000..U+D7FF] ∪ [U+E000..U+10FFFF]). This should finally address haskell#4644
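To make the scalar-value rule above concrete, here is a minimal sketch of the pre-encoding step it describes. The names `isScalarValue`, `replaceSurrogates` and `preserved` are hypothetical and only for illustration; this is not the actual Cabal code, just one way the surrogate replacement could look, assuming the input is an ordinary Haskell `String`.

```haskell
import Data.Char (ord)

-- Hypothetical helper (not part of Cabal): True for Unicode scalar values,
-- i.e. every code point except the surrogate range U+D800..U+DFFF.
isScalarValue :: Char -> Bool
isScalarValue c = n < 0xD800 || n > 0xDFFF
  where
    n = ord c

-- Hypothetical helper (not part of Cabal): map each surrogate to U+FFFD
-- and leave every scalar value untouched, as an encoder could do up front.
replaceSurrogates :: String -> String
replaceSurrogates = map fixup
  where
    fixup c
      | isScalarValue c = c
      | otherwise       = '\xFFFD'

-- U+FEFF (the BOM), U+FFFE and U+FFFF are scalar values, so they should
-- pass through unchanged rather than being turned into U+FFFD.
preserved :: String
preserved = ['\xFEFF', '\xFFFE', '\xFFFF']
```

Under these assumptions, `replaceSurrogates ['a', '\xD800', 'z']` yields `['a', '\xFFFD', 'z']`, while `replaceSurrogates preserved` returns `preserved` unchanged, which is the behaviour a round-trip test on the BOM relies on.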
I'm confident this one's been fixed via #4928; I ran
https://travis-ci.org/haskell-pushbot/cabal-binaries/builds/258926423
ping @ezyang