Closed
Description
The byte sequence [34, 228, 166, 164, 110, 237, 166, 164, 44, 34] ("䦤n���,", quotes are part of the string itself) is considered valid UTF-8 by ECMAScript (or at least Node.js and Firefox), but not by the Rust std library.
Not knowing enough about Unicode and UTF-8, I'm just assuming that Rust is doing this incorrectly, since both V8 and SpiderMonkey accept it as valid UTF-8.
JSON.parse('"䦤n���,"')
in JavaScript returns a string, whereas in Rust:
println!("{:?}", String::from_utf8(vec![34u8, 228, 166, 164, 110, 237, 166, 164, 44, 34]));
> Err(FromUtf8Error { bytes: [34, 228, 166, 164, 110, 237, 166, 164, 44, 34], error: Utf8Error { valid_up_to: 5, error_len: Some(1) } })
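To see what the error is pointing at, here is a minimal sketch that decodes the three bytes starting at offset 5 (the `valid_up_to` value above) by hand. The helpers `decode_three_byte` and `is_surrogate` are hypothetical names made up for illustration, not part of std. The bytes 237, 166, 164 (0xED 0xA6 0xA4) have the shape of a three-byte sequence, but the value they carry is U+D9A4, which falls in the surrogate range U+D800..=U+DFFF. UTF-8 does not allow surrogates to be encoded, which would explain why `String::from_utf8` rejects the input. Note also that `JSON.parse` operates on a JavaScript string rather than on raw bytes, so it may not be performing any UTF-8 validation at all in the comparison above.

```rust
/// Hypothetical helper: extract the raw code point value from a
/// three-byte UTF-8-shaped sequence, without any validity checks.
fn decode_three_byte(b: [u8; 3]) -> u32 {
    ((b[0] as u32 & 0x0F) << 12) | ((b[1] as u32 & 0x3F) << 6) | (b[2] as u32 & 0x3F)
}

/// Hypothetical helper: true if the value lies in the UTF-16 surrogate range.
fn is_surrogate(cp: u32) -> bool {
    (0xD800..=0xDFFF).contains(&cp)
}

fn main() {
    let bytes = [34u8, 228, 166, 164, 110, 237, 166, 164, 44, 34];

    // Ask std where validation stops, as in the report above.
    if let Err(e) = std::str::from_utf8(&bytes) {
        let start = e.valid_up_to();
        // Everything before `start` is valid UTF-8, so this unwrap cannot fail.
        let prefix = std::str::from_utf8(&bytes[..start]).unwrap();
        println!("valid prefix: {:?}", prefix);
        println!("first rejected byte: {:#04x}", bytes[start]);

        // Decode the offending three bytes by hand.
        let cp = decode_three_byte([bytes[start], bytes[start + 1], bytes[start + 2]]);
        println!("raw value: U+{:04X}", cp);
        println!("is surrogate: {}", is_surrogate(cp));
        // `char` cannot hold a surrogate, so this conversion yields None.
        println!("as char: {:?}", char::from_u32(cp));
    }
}
```

The earlier bytes 228, 166, 164 decode the same way to U+49A4 (䦤), which is an ordinary scalar value, so only the second three-byte group is affected.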
rustc --version --verbose
binary: rustc
commit-hash: de3d640f59c4fa4a09faf2a8d6b0a812aaa6d6cb
commit-date: 2018-10-01
host: x86_64-unknown-linux-gnu
release: 1.31.0-nightly
LLVM version: 8.0