RFC: Rename `char` to make it clearer that it is a unicode codepoint/scalar value #12730

Our `char` type is a [Unicode scalar value](http://www.unicode.org/glossary/#unicode_scalar_value) (a codepoint excluding the surrogate range), which can lead to confusion because (a) it differs from what `char` means in other languages and (b) the name doesn't encourage good Unicode hygiene ("Oh, a character? That's what the user sees").
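To make point (b) concrete, here is a minimal sketch of how a `char` differs from what the user sees. The helper names `scalar_count` and `is_scalar_value` are made up for illustration; only `str::chars` and `char::from_u32` come from the standard library.

```rust
/// Number of Unicode scalar values (`char`s) in a string.
/// Hypothetical helper, not part of the standard library.
fn scalar_count(s: &str) -> usize {
    s.chars().count()
}

/// Whether a raw code point is a valid `char`, i.e. a Unicode scalar value.
/// Surrogates (U+D800..=U+DFFF) and values above U+10FFFF are rejected.
/// Hypothetical helper, not part of the standard library.
fn is_scalar_value(code_point: u32) -> bool {
    char::from_u32(code_point).is_some()
}
```

For example, `scalar_count("e\u{301}")` counts two `char`s ("e" followed by a combining acute accent) even though a user would typically see a single accented letter, and `is_scalar_value(0xD800)` is false because surrogates are codepoints but not scalar values.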

Possible names include `codepoint`, `ucs4`, or `rune`, as in Go.

Other languages' names for a Unicode scalar value, or what `char` means in them:
- Haskell: `Char` is a codepoint (although surrogates are allowed)
- D: `dchar` (`char` is a "UTF-8 code unit" and `wchar` is a "UTF-16 code unit", i.e. aliases for `u8` and `u16`?; see http://dlang.org/type.html)
- Go: `rune`
- C#/Java/Scala etc.: `char` is a 16-bit integer (i.e. UTF-16 code unit)
- C/C++: `char` is (normally) a byte, i.e. a UTF-8 code unit.

(Other languages like Python don't have a type for a single character and don't have a type called `char`, and so aren't meaningful for this comparison.)
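The three notions of "char" in the list above (UTF-8 code unit, UTF-16 code unit, scalar value) give different lengths for the same text. A rough sketch in Rust, where `unit_counts` is a hypothetical helper name and only `str::len`, `str::encode_utf16`, and `str::chars` are real standard library calls:

```rust
/// Returns (UTF-8 code units, UTF-16 code units, Unicode scalar values)
/// for the given string. Hypothetical helper for illustration only.
fn unit_counts(s: &str) -> (usize, usize, usize) {
    (
        s.len(),                  // bytes, roughly what a C `char` counts
        s.encode_utf16().count(), // 16-bit units, roughly a Java/C# `char`
        s.chars().count(),        // scalar values, what Rust's `char` is
    )
}
```

Under these definitions, `unit_counts("\u{1F600}")` gives `(4, 2, 1)`: one Rust `char`, but two `char`s in the UTF-16 sense and four in the byte sense.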

(This issue brought to you by [reddit](http://www.reddit.com/r/rust/comments/1zlq21/should_rust_be_more_careful_with_unicode/).)

