Fix Encoding considerations description of codepoint escape sequences. (#339)

kasei · web-flow · commit 0bf8c674982d · 2026-01-15T08:47:58.000-08:00
The previous text suggested that the \U escape sequence form could only be used
for codepoints from U+10000 onwards, and that both the \u and \U forms had to
use uppercase hexadecimal digits. This commit updates the text to reflect that
\U escapes can encode any codepoint (U+0 to U+10FFFF), that the hexadecimal
digits are case-insensitive, and that the surrogate code points (U+D800 to
U+DFFF) are excluded.
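
As an illustration of the corrected rules, here is a minimal Python sketch of a
decoder for the two escape forms. It is not taken from the specification or from
any SPARQL implementation; the function name `decode_codepoint_escapes` and the
error messages are hypothetical, and the sketch ignores every other part of the
grammar.

```python
import re

# Matches \uXXXX (4 hex digits) or \UXXXXXXXX (8 hex digits).
# Hex digits are accepted in either case, per the corrected text.
_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def decode_codepoint_escapes(text: str) -> str:
    """Replace \\uXXXX and \\UXXXXXXXX escapes with the characters they name.

    Hypothetical helper for illustration only.
    """

    def replace(match: re.Match) -> str:
        # Exactly one of the two groups is set, depending on which form matched.
        codepoint = int(match.group(1) or match.group(2), 16)
        # Surrogate code points are excluded from both forms.
        if 0xD800 <= codepoint <= 0xDFFF:
            raise ValueError(f"surrogate code point U+{codepoint:04X} is not allowed")
        # \U has eight digits, so it can spell values beyond the Unicode range.
        if codepoint > 0x10FFFF:
            raise ValueError(f"U+{codepoint:X} is outside the Unicode range")
        return chr(codepoint)

    return _ESCAPE.sub(replace, text)
```

Under these assumptions, `\u00e9`, `\u00E9`, and `\U000000E9` all decode to the
same character, while `\uD800` and `\U00110000` are rejected.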
diff --git a/spec/index.html b/spec/index.html
@@ -12674,7 +12674,8 @@ <h2>Internet Media Type, File Extension and Macintosh File Type</h2>
           <dd>The syntax of the SPARQL Query Language is expressed over code points in Unicode
             [[UNICODE]]. The encoding is always UTF-8 [[RFC3629]].</dd>
           <dd>Unicode code points may also be expressed using an \uXXXX (U+0 to U+FFFF) or
-            \UXXXXXXXX syntax (for U+10000 onwards) where X is a hexadecimal digit [0-9A-F]</dd>
+            \UXXXXXXXX syntax (U+0 to U+10FFFF), where X is a hexadecimal digit [0-9A-Fa-f],
+            excluding U+D800 to U+DFFF, the <a data-cite="I18N-GLOSSARY#dfn-surrogate">surrogate code points</a>.</dd>
           <dt>Security considerations:</dt>
           <dd>
             See SPARQL Query appendix C, <a href="#security">Security Considerations</a> as well as