|
110 | 110 | \indextext{line splicing}%
|
111 | 111 | If the first translation character is \unicode{feff}{byte order mark},
|
112 | 112 | it is deleted.
|
113 |
| -Each sequence of a backslash character (\textbackslash) |
| 113 | +Each sequence of a backslash character (\unicode{005c}{reverse solidus}) |
114 | 114 | immediately followed by
|
115 |
| -zero or more whitespace characters other than new-line followed by |
| 115 | +zero or more \grammarterm{whitespace-character}s other than new-line followed by |
116 | 116 | a new-line character is deleted, splicing
|
117 | 117 | physical source lines to form \defnx{logical source lines}{source line!logical}. Only the last
|
118 | 118 | backslash on any physical source line shall be eligible for being part
|
|
126 | 126 | shall be processed as if an additional new-line character were appended
|
127 | 127 | to the file.
|
128 | 128 |
|
129 |
| -\item The source file is decomposed into preprocessing |
130 |
| -tokens\iref{lex.pptoken} and sequences of whitespace characters |
131 |
| -(including comments). A source file shall not end in a partial |
| 129 | +\item |
| 130 | +\indextext{whitespace}% |
| 131 | +\indextext{comment}% |
| 132 | +\indextext{token!preprocessing}% |
| 133 | +The source file is decomposed into preprocessing |
| 134 | +tokens\iref{lex.pptoken} and whitespace\iref{lex.whitespace} (sequences of \grammarterm{whitespace-character}s |
| 135 | +and comments). A source file shall not end in a partial |
132 | 136 | preprocessing token or in a partial comment.
|
133 | 137 | \begin{footnote}
|
134 | 138 | A partial preprocessing
|
|
140 | 144 | would arise from a source file ending with an unclosed \tcode{/*}
|
141 | 145 | comment.
|
142 | 146 | \end{footnote}
|
143 |
| -Each comment\iref{lex.comment} is replaced by one space character. New-line characters are |
144 |
| -retained. Whether each nonempty sequence of whitespace characters other |
145 |
| -than new-line is retained or replaced by one space character is |
| 147 | +Each comment\iref{lex.comment} is replaced by one \unicode{0020}{space} character. New-line characters are |
| 148 | +retained. Whether each nonempty sequence of \grammarterm{whitespace-character}s other |
| 149 | +than new-line is retained or replaced by one \unicode{0020}{space} character is |
146 | 150 | unspecified.
|
147 | 151 | As characters from the source file are consumed
|
148 | 152 | to form the next preprocessing token
|
|
178 | 182 | \item
|
179 | 183 | Adjacent \grammarterm{string-literal} tokens are concatenated\iref{lex.string}.
|
180 | 184 |
|
181 |
| -\item Whitespace characters separating tokens are no longer |
| 185 | +\item |
| 186 | +Any \grammarterm{whitespace-character}s separating tokens are no longer |
182 | 187 | significant. Each preprocessing token is converted into a
|
183 | 188 | token\iref{lex.token}. The resulting tokens
|
184 | 189 | constitute a \defn{translation unit} and
|
|
467 | 472 | None of these names or aliases have leading or trailing spaces.
|
468 | 473 | \end{note}
|
469 | 474 |
|
470 |
| -\rSec1[lex.comment]{Comments} |
| 475 | +\rSec1[lex.whitespace]{Whitespace} |
| 476 | +\indextext{whitespace|(}% |
| 477 | + |
| 478 | +\rSec2[lex.whitechar]{Whitespace characters} |
| 479 | + |
| 480 | +\indextext{character!whitespace|(}% |
| 481 | +\begin{bnf} |
| 482 | +\nontermdef{whitespace-character}\br |
| 483 | + \unicode{0009}{character tabulation}\br |
| 484 | + \textnormal{new-line}\br |
| 485 | + \unicode{000b}{line tabulation}\br |
| 486 | + \unicode{000c}{form feed}\br |
| 487 | + \unicode{0020}{space} |
| 488 | +\end{bnf} |
| 489 | + |
| 490 | +\pnum |
| 491 | +\begin{note} |
| 492 | +Whitespace characters are used to separate elements of the \Cpp grammar. |
| 493 | +\end{note} |
| 494 | +\indextext{character!whitespace|)} |
| 495 | + |
| 496 | +\rSec2[lex.comment]{Comments} |
471 | 497 |
|
472 | 498 | \pnum
|
473 | 499 | \indextext{comment|(}%
|
|
477 | 503 | characters \tcode{*/}. These comments do not nest.
|
478 | 504 | \indextext{comment!\tcode{//}}%
|
479 | 505 | The characters \tcode{//} start a comment, which terminates immediately before the
|
480 |
| -next new-line character. If there is a form-feed or a vertical-tab |
481 |
| -character in such a comment, only whitespace characters shall appear |
| 506 | +next new-line character. If there is a \unicode{000c}{form feed} or a \unicode{000b}{line tabulation} |
| 507 | +character in such a comment, only \grammarterm{whitespace-character}s shall appear |
482 | 508 | between it and the new-line that terminates the comment; no diagnostic
|
483 | 509 | is required.
|
484 | 510 | \begin{note}
|
|
489 | 515 | \tcode{/*} comment.
|
490 | 516 | \end{note}
|
491 | 517 | \indextext{comment|)}
|
| 518 | +\indextext{whitespace|)}% |
492 | 519 |
|
493 | 520 | \rSec1[lex.pptoken]{Preprocessing tokens}
|
494 | 521 |
|
|
506 | 533 | string-literal\br
|
507 | 534 | user-defined-string-literal\br
|
508 | 535 | preprocessing-op-or-punc\br
|
509 |
| - \textnormal{each non-whitespace character that cannot be one of the above} |
| 536 | + \textnormal{each non-\grammarterm{whitespace-character} that cannot be one of the above} |
510 | 537 | \end{bnf}
|
511 | 538 |
|
512 | 539 | \pnum
|
|
520 | 547 | (\grammarterm{import-keyword}, \grammarterm{module-keyword}, and \grammarterm{export-keyword}),
|
521 | 548 | identifiers, preprocessing numbers, character literals (including user-defined character
|
522 | 549 | literals), string literals (including user-defined string literals), preprocessing
|
523 |
| -operators and punctuators, and single non-whitespace characters that do not lexically |
| 550 | +operators and punctuators, and single non-\grammarterm{whitespace-character}s that do not lexically |
524 | 551 | match the other preprocessing token categories.
|
525 | 552 | If a \unicode{0027}{apostrophe} or a \unicode{0022}{quotation mark} character
|
526 | 553 | matches the last category, the program is ill-formed.
|
527 | 554 | If any character not in the basic character set matches the last category,
|
528 | 555 | the program is ill-formed.
|
529 | 556 | Preprocessing tokens can be separated by
|
530 | 557 | \indextext{whitespace}%
|
531 |
| -whitespace; |
| 558 | +whitespace\iref{lex.whitespace}; |
532 | 559 | \indextext{comment}%
|
533 |
| -this consists of comments\iref{lex.comment}, or whitespace characters |
534 |
| -(\unicode{0020}{space}, |
535 |
| -\unicode{0009}{character tabulation}, |
536 |
| -new-line, |
537 |
| -\unicode{000b}{line tabulation}, and |
538 |
| -\unicode{000c}{form feed}), or both. |
| 560 | +this consists of comments, \grammarterm{whitespace-character}s, or both. |
539 | 561 | As described in \ref{cpp}, in certain
|
540 | 562 | circumstances during translation phase 4, whitespace (or the absence
|
541 | 563 | thereof) serves as more than preprocessing token separation. Whitespace
|
|
826 | 848 | \end{footnote}
|
827 | 849 | operators, and other separators.
|
828 | 850 | \indextext{whitespace}%
|
829 |
| -Blanks, horizontal and vertical tabs, newlines, formfeeds, and comments |
830 |
| -(collectively, ``whitespace''), as described below, are ignored except |
831 |
| -as they serve to separate tokens. |
| 851 | +Whitespace\iref{lex.whitespace} is ignored except to separate tokens. |
832 | 852 | \begin{note}
|
833 | 853 | Whitespace can separate otherwise adjacent identifiers, keywords, numeric
|
834 | 854 | literals, and alternative tokens containing alphabetic characters.
|
|
1790 | 1810 | \begin{bnf}
|
1791 | 1811 | \nontermdef{d-char}\br
|
1792 | 1812 | \textnormal{any member of the basic character set except:}\br
|
1793 |
| - \bnfindent\textnormal{\unicode{0020}{space}, \unicode{0028}{left parenthesis}, \unicode{0029}{right parenthesis}, \unicode{005c}{reverse solidus},}\br |
1794 |
| - \bnfindent\textnormal{\unicode{0009}{character tabulation}, \unicode{000b}{line tabulation}, \unicode{000c}{form feed}, and new-line} |
| 1813 | + \bnfindent\textnormal{a \grammarterm{whitespace-character}, \unicode{0028}{left parenthesis}, \unicode{0029}{right parenthesis},}\br |
| 1814 | + \bnfindent\textnormal{and \unicode{005c}{reverse solidus}} |
1795 | 1815 | \end{bnf}
|
1796 | 1816 |
|
1797 | 1817 | \pnum
|
|
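The \grammarterm{whitespace-character} production and the line-splicing wording touched by this diff can be illustrated with a small C++ sketch. The helper names `is_whitespace_character` and `splice_lines` are hypothetical and do not come from the draft or from any implementation. The sketch also assumes that new-line is represented as `'\n'`, and it leaves out the other parts of that phase (deleting a leading byte order mark, appending a missing final new-line).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: true for the five alternatives of whitespace-character
// (U+0009, new-line, U+000B, U+000C, U+0020).
// Assumption: new-line is represented as '\n'.
constexpr bool is_whitespace_character(char c) {
    return c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == ' ';
}

// Hypothetical helper: sketch of the splicing rule only.
// Deletes each backslash that is immediately followed by zero or more
// whitespace-characters other than new-line and then a new-line, together
// with that trailing whitespace and the new-line itself.
std::string splice_lines(const std::string& src) {
    std::string out;
    std::size_t i = 0;
    while (i < src.size()) {
        if (src[i] == '\\') {
            // Skip whitespace-characters other than new-line after the backslash.
            std::size_t j = i + 1;
            while (j < src.size() && src[j] != '\n' &&
                   is_whitespace_character(src[j])) {
                ++j;
            }
            if (j < src.size() && src[j] == '\n') {
                i = j + 1;  // drop backslash, trailing whitespace, and new-line
                continue;
            }
        }
        out.push_back(src[i]);
        ++i;
    }
    return out;
}
```

Under these assumptions, a backslash followed by anything other than optional whitespace and a new-line is copied through unchanged, so only the last backslash on a physical source line can take part in a splice.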