@@ -29,7 +29,7 @@ You may also be interested in the [grammar].

  # Notation

- Rust's grammar is defined over Unicode codepoints, each conventionally denoted
+ Rust's grammar is defined over Unicode code points, each conventionally denoted
  `U+XXXX`, for 4 or more hexadecimal digits `X`. _Most_ of Rust's grammar is
  confined to the ASCII range of Unicode, and is described in this document by a
  dialect of Extended Backus-Naur Form (EBNF), specifically a dialect of EBNF
@@ -53,7 +53,7 @@
  - Square brackets are used to group rules.
  - `LITERAL` is a single printable ASCII character, or an escaped hexadecimal
  ASCII code of the form `\xQQ`, in single quotes, denoting the corresponding
- Unicode codepoint `U+00QQ`.
+ Unicode code point `U+00QQ`.
  - `IDENTIFIER` is a nonempty string of ASCII letters and underscores.
  - The `repeat` forms apply to the adjacent `element`, and are as follows:
  - `?` means zero or one repetition
@@ -66,9 +66,9 @@ This EBNF dialect should hopefully be familiar to many readers.

  ## Unicode productions

- A few productions in Rust's grammar permit Unicode codepoints outside the ASCII
+ A few productions in Rust's grammar permit Unicode code points outside the ASCII
  range. We define these productions in terms of character properties specified
- in the Unicode standard, rather than in terms of ASCII-range codepoints. The
+ in the Unicode standard, rather than in terms of ASCII-range code points. The
  section [Special Unicode Productions](#special-unicode-productions) lists these
  productions.

@@ -91,10 +91,10 @@ production. See [tokens](#tokens) for more information.

  ## Input format

- Rust input is interpreted as a sequence of Unicode codepoints encoded in UTF-8.
+ Rust input is interpreted as a sequence of Unicode code points encoded in UTF-8.
  Most Rust grammar rules are defined in terms of printable ASCII-range
- codepoints, but a small number are defined in terms of Unicode properties or
- explicit codepoint lists. [^inputformat]
+ code points, but a small number are defined in terms of Unicode properties or
+ explicit code point lists. [^inputformat]

  [^inputformat]: Substitute definitions for the special Unicode productions are
  provided to the grammar verifier, restricted to ASCII range, when verifying the
@@ -147,7 +147,7 @@ comments beginning with exactly one repeated asterisk in the block-open
  sequence (`/**`), are interpreted as a special syntax for `doc`
  [attributes](#attributes). That is, they are equivalent to writing
  `#[doc="..."]` around the body of the comment (this includes the comment
- characters themselves, ie `/// Foo` turns into `#[doc="/// Foo"]`).
+ characters themselves, i.e. `/// Foo` turns into `#[doc="/// Foo"]`).

  Line comments beginning with `//!` and block comments beginning with `/*!` are
  doc comments that apply to the parent of the comment, rather than the item
@@ -333,14 +333,14 @@ Some additional _escapes_ are available in either character or non-raw string
  literals. An escape starts with a `U+005C` (`\`) and continues with one of the
  following forms:

- * An _8-bit codepoint escape_ escape starts with `U+0078` (`x`) and is
- followed by exactly two _hex digits_. It denotes the Unicode codepoint
+ * An _8-bit code point escape_ starts with `U+0078` (`x`) and is
+ followed by exactly two _hex digits_. It denotes the Unicode code point
  equal to the provided hex value.
- * A _24-bit codepoint escape_ starts with `U+0075` (`u`) and is followed
+ * A _24-bit code point escape_ starts with `U+0075` (`u`) and is followed
  by up to six _hex digits_ surrounded by braces `U+007B` (`{`) and `U+007D`
- (`}`). It denotes the Unicode codepoint equal to the provided hex value.
+ (`}`). It denotes the Unicode code point equal to the provided hex value.
  * A _whitespace escape_ is one of the characters `U+006E` (`n`), `U+0072`
- (`r`), or `U+0074` (`t`), denoting the unicode values `U+000A` (LF),
+ (`r`), or `U+0074` (`t`), denoting the Unicode values `U+000A` (LF),
  `U+000D` (CR) or `U+0009` (HT) respectively.
  * The _backslash escape_ is the character `U+005C` (`\`) which must be
  escaped in order to denote *itself*.
@@ -410,7 +410,7 @@ Some additional _escapes_ are available in either byte or non-raw byte string
  literals. An escape starts with a `U+005C` (`\`) and continues with one of the
  following forms:

- * An _byte escape_ escape starts with `U+0078` (`x`) and is
+ * A _byte escape_ escape starts with `U+0078` (`x`) and is
  followed by exactly two _hex digits_. It denotes the byte
  equal to the provided hex value.
  * A _whitespace escape_ is one of the characters `U+006E` (`n`), `U+0072`
@@ -700,9 +700,9 @@ in macro rules). In the transcriber, the designator is already known, and so
  only the name of a matched nonterminal comes after the dollar sign.

  In both the matcher and transcriber, the Kleene star-like operator indicates
- repetition. The Kleene star operator consists of `$` and parens, optionally
+ repetition. The Kleene star operator consists of `$` and parenthesis, optionally
  followed by a separator token, followed by `*` or `+`. `*` means zero or more
- repetitions, `+` means at least one repetition. The parens are not matched or
+ repetitions, `+` means at least one repetition. The parenthesis are not matched or
  transcribed. On the matcher side, a name is bound to _all_ of the names it
  matches, in a structure that mimics the structure of the repetition encountered
  on a successful match. The job of the transcriber is to sort that structure
@@ -1203,9 +1203,9 @@ the guarantee that these issues are never caused by safe code.

  [noalias]: http://llvm.org/docs/LangRef.html#noalias

- ##### Behaviour not considered unsafe
+ ##### Behavior not considered unsafe

- This is a list of behaviour not considered *unsafe* in Rust terms, but that may
+ This is a list of behavior not considered *unsafe* in Rust terms, but that may
  be undesired.

  * Deadlocks
@@ -1298,7 +1298,7 @@ specific type, but may implement several different traits, or be compatible with
  several different type constraints.

  For example, the following defines the type `Point` as a synonym for the type
- `(u8, u8)`, the type of pairs of unsigned 8 bit integers. :
+ `(u8, u8)`, the type of pairs of unsigned 8 bit integers:

  ```
  type Point = (u8, u8);
@@ -1952,7 +1952,7 @@ type int8_t = i8;

  ### Crate-only attributes

- - `crate_name` - specify the this crate's crate name.
+ - `crate_name` - specify the crate's crate name.
  - `crate_type` - see [linkage](#linkage).
  - `feature` - see [compiler features](#compiler-features).
  - `no_builtins` - disable optimizing certain code patterns to invocations of
@@ -3464,7 +3464,7 @@ is not a surrogate), represented as a 32-bit unsigned word in the 0x0000 to
  UTF-32 string.

  A value of type `str` is a Unicode string, represented as an array of 8-bit
- unsigned bytes holding a sequence of UTF-8 codepoints. Since `str` is of
+ unsigned bytes holding a sequence of UTF-8 code points. Since `str` is of
  unknown size, it is not a _first-class_ type, but can only be instantiated
  through a pointer type, such as `&str` or `String`.
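
Reviewer's note, not part of the commit: the hunks above touch the descriptions of escapes and of `str`, so a small Rust sketch of that behavior may help when checking the reworded text. It assumes a current stable Rust toolchain; `code_points` is a hypothetical helper introduced only for this illustration and does not appear in the reference.

```rust
/// Hypothetical helper: formats each code point of `s` in the `U+XXXX`
/// notation used by the reference (four or more hexadecimal digits).
fn code_points(s: &str) -> Vec<String> {
    s.chars().map(|c| format!("U+{:04X}", c as u32)).collect()
}

fn main() {
    // 8-bit code point escape: `\x` followed by exactly two hex digits.
    assert_eq!('\x41', 'A');

    // 24-bit code point escape: `\u{...}` with up to six hex digits.
    assert_eq!('\u{E9}', 'é');

    // Whitespace escapes denote LF, CR and HT respectively.
    assert_eq!('\n' as u32, 0x0A);
    assert_eq!('\r' as u32, 0x0D);
    assert_eq!('\t' as u32, 0x09);

    // The backslash escape denotes U+005C itself.
    assert_eq!('\\' as u32, 0x5C);

    // In a byte literal, `\x` is a byte escape and denotes a byte value.
    assert_eq!(b'\xFF', 255u8);

    // A `str` is stored as UTF-8 bytes, so its byte length and its number
    // of code points can differ.
    let s = "h\u{E9}";
    assert_eq!(s.len(), 3); // 1 byte for 'h', 2 bytes for U+00E9
    assert_eq!(s.chars().count(), 2);
    assert_eq!(code_points(s), vec!["U+0068", "U+00E9"]);
}
```

The assertions pass silently, so a clean run of `main` is the whole check.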