implementers-tips.md (3 additions & 3 deletions)
@@ -8,11 +8,11 @@
- ㊗ (U+3297)
- ㊙ (U+3299)
- Do not treat every character in [emoji-data.txt](https://www.unicode.org/Public/UCD/latest/ucd/emoji/emoji-data.txt) in the below data list as emoji. It includes ASCII digits, ASCII asterisk, ASCII hash sign, copyright symbol, trademark symbol, and so on. They should not be treated as emoji unless followed by a U+FE0F. We have to extract only characters with the `Emoji_Presentation` label.
-
- You can use `/^\p{Emoji_Presentation}/u`, or `/^\p{Basic_Emoji}/v` or `/^\p{RGI_Emoji}/v` in JavaScript to check if a code point is an emoji (as a default emoji presentation character or in the RGI emoji set). __`RGI_Emoji` characters other than `Basic_Emoji`__ ([basic emoji set](https://www.unicode.org/reports/tr51/#def_basic_emoji_set)) __have multiple code points and are not CJK as of Unicode 16. Never use `/^\p{Emoji}/u`__ instead of them because it is useless due to the fact that `/^\p{Emoji}/u.test("1")` is `true` (who on earth would insist that `1` is an emoji?). The `v` flag is available since ES2024 and supported by Node >= 20, Chrome (Edge) >= 112, Firefox >= 116, and Safari >= 17.
+
- You can use `/^\p{Emoji_Presentation}/u`, or `/^\p{Basic_Emoji}/v` or `/^\p{RGI_Emoji}/v` in JavaScript to check if a code point is an emoji (as a default emoji presentation character or in the RGI emoji set). __`RGI_Emoji` characters other than `Basic_Emoji`__ ([basic emoji set](https://www.unicode.org/reports/tr51/#def_basic_emoji_set)) __have multiple code points and are not CJK as of Unicode 17. Never use `/^\p{Emoji}/u`__ instead of them because it is useless due to the fact that `/^\p{Emoji}/u.test("1")` is `true` (who on earth would insist that `1` is an emoji?). The `v` flag is available since ES2024 and supported by Node >= 20, Chrome (Edge) >= 112, Firefox >= 116, and Safari >= 17.
  - `"ES2024"` as `"target"` and `"lib"` in `tsconfig.json` is supported by TypeScript >= 5.7, Vite >= 6, and Vitest >= 3. You should use `"ESNext"` instead of `"ES2024"` for older ecosystems.
-
- There are no emojis whose East Asian Width is `F` or `H` as of Unicode 16.
+
- There are no emojis whose East Asian Width is `F` or `H` as of Unicode 17.
- The East Asian Width of Ideographic Variation Selector and Standard Variation Selector is `A`.
-
- The East Asian Width of characters whose Script is Hangul can be `N` (U+1160–U+11FF). However, there are no characters whose Script is Hangul and East Asian Width is `A` or `Na` as of Unicode 16.
+
- The East Asian Width of characters whose Script is Hangul can be `N` (U+1160–U+11FF). However, there are no characters whose Script is Hangul and East Asian Width is `A` or `Na` as of Unicode 17.
- You can use `/^\p{sc=Hangul}/u` in JavaScript to check if the Script of a character is Hangul.
- The East Asian Width of unassigned characters (e.g. U+3097) is undefined. You should follow the [guideline by Unicode](https://www.unicode.org/reports/tr11/#Unassigned). Note that U+2FFFE–U+2FFFF and U+3FFFE–U+3FFFF are Noncharacters, not Reserved (Unassigned) code points. The East Asian Width of Noncharacters does not seem to be mentioned in the specification of the East Asian Width property. Therefore, you can treat them as `W` so that the two ranges U+20000–U+2FFFD and U+30000–U+3FFFD merge into a single contiguous range.
- The Unicode category of Ideographic Variation Selector and Standard Variation Selector is `Mn`, not `P` or `S`. It means there is no [Unicode punctuation character](https://spec.commonmark.org/0.31.2/#unicode-punctuation-character) or [non-CJK punctuation character](#non-cjk-punctuation-character) that is also Standard Variation Selector or Ideographic Variation Selector.
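
The property checks mentioned in the list above can be sketched roughly as follows. This is only an illustration, assuming a JavaScript engine that supports Unicode property escapes under the `u` flag; the helper names are hypothetical and do not come from the original file. The `v`-flag properties (`Basic_Emoji`, `RGI_Emoji`) are left out so that the sketch also runs on engines older than those listed above.

```javascript
// Hypothetical helpers (names are illustrative, not part of the original file).

// True if the string starts with a character whose default presentation is emoji.
const startsWithEmojiPresentation = (s) => /^\p{Emoji_Presentation}/u.test(s);

// Counterexample only: \p{Emoji} also matches ASCII digits, "#", "*", and so on.
const startsWithEmojiProperty = (s) => /^\p{Emoji}/u.test(s);

// True if the Script of the first character is Hangul.
const startsWithHangul = (s) => /^\p{sc=Hangul}/u.test(s);

console.log(startsWithEmojiProperty("1"));          // true  (why \p{Emoji} is unsuitable)
console.log(startsWithEmojiPresentation("1"));      // false
console.log(startsWithEmojiPresentation("\u00A9")); // false (copyright sign, text presentation by default)
console.log(startsWithEmojiPresentation("😀"));     // true
console.log(startsWithHangul("한"));                // true
console.log(startsWithHangul("A"));                 // false
```

The contrast between the first two calls is the point of the warning against `\p{Emoji}`: the digit `1` carries the `Emoji` property but not `Emoji_Presentation`.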
## EAW is treated as "W" if unassigned (defined by Unicode)
> [!NOTE]
-
> The following result is extracted from https://www.unicode.org/Public/16.0.0/ucd/EastAsianWidth.txt. It is slightly different from https://www.unicode.org/reports/tr11/#Unassigned. U+2FFFE, U+2FFFF, U+3FFFE, and U+3FFFF are missing, but [they are "Noncharacter"](https://www.unicode.org/versions/Unicode16.0.0/core-spec/chapter-23/#G19653), not ["Unassigned" (or "Reserved")](https://www.unicode.org/glossary/#reserved_code_point). This shows that we do not have to care about whether they are included in the list of CJK code points or not. To simplify the ranges, U+2FFFE and U+2FFFF are merged to U+20000–U+2FFFD here.
+
> The following result is extracted from https://www.unicode.org/Public/17.0.0/ucd/EastAsianWidth.txt. It is slightly different from https://www.unicode.org/reports/tr11/#Unassigned. U+2FFFE, U+2FFFF, U+3FFFE, and U+3FFFF are missing, but [they are "Noncharacter"](https://www.unicode.org/versions/Unicode17.0.0/core-spec/chapter-23/#G19653), not ["Unassigned" (or "Reserved")](https://www.unicode.org/glossary/#reserved_code_point). This shows that we do not have to care about whether they are included in the list of CJK code points or not. To simplify the ranges, U+2FFFE and U+2FFFF are merged to U+20000–U+2FFFD here.
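
The effect of treating the four noncharacters as `W` can be sketched as below. This is a hedged illustration covering only the two supplementary-plane ranges named in the note, not the full list of `W` code points; the function names are hypothetical.

```javascript
// Hypothetical helpers; only U+20000–U+2FFFD and U+30000–U+3FFFD are considered here.

// Strict form: excludes the noncharacters U+2FFFE, U+2FFFF, U+3FFFE, and U+3FFFF.
const isPlane2Or3Strict = (cp) =>
  (cp >= 0x20000 && cp <= 0x2fffd) || (cp >= 0x30000 && cp <= 0x3fffd);

// Merged form: also accepts the four noncharacters, so a single range check suffices.
const isPlane2Or3Merged = (cp) => cp >= 0x20000 && cp <= 0x3ffff;

// Collect every code point on which the two forms disagree.
const differences = [];
for (let cp = 0x1ffff; cp <= 0x40000; cp++) {
  if (isPlane2Or3Strict(cp) !== isPlane2Or3Merged(cp)) {
    differences.push("U+" + cp.toString(16).toUpperCase());
  }
}
console.log(differences.join(", ")); // U+2FFFE, U+2FFFF, U+3FFFE, U+3FFFF
```

The two forms disagree only on the noncharacters, so the merged check is simpler whenever the width assigned to those four code points does not matter.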
specification.md (10 additions & 10 deletions)
@@ -33,7 +33,7 @@ A <a href="#cjk-punctuation-sequence" id="cjk-punctuation-sequence">CJK punctuat
A <a href="#non-cjk-punctuation-sequence" id="non-cjk-punctuation-sequence">Non-CJK punctuation sequence</a> is a [Non-CJK punctuation character](#non-cjk-punctuation-character) or a sequence of 2 [characters](https://spec.commonmark.org/0.31.2/#character) where the first one is [Non-CJK punctuation character](#non-cjk-punctuation-character) and the second one is [Non-emoji General-use Variation Selector](#non-emoji-general-use-variation-selector).
-
[^svs-range]: The range except for U+FE0E is computed from https://www.unicode.org/Public/16.0.0/ucd/StandardizedVariants.txt (as of Unicode 16) by extracting those that can follow CJK characters. Also, https://unicode.org/Public/16.0.0/ucd/emoji/emoji-variation-sequences.txt shows that U+FE0E can follow some CJK characters.
+
[^svs-range]: The range except for U+FE0E is computed from https://www.unicode.org/Public/17.0.0/ucd/StandardizedVariants.txt (as of Unicode 17) by extracting those that can follow CJK characters. Also, https://unicode.org/Public/17.0.0/ucd/emoji/emoji-variation-sequences.txt shows that U+FE0E can follow some CJK characters.
> [!NOTE]
> To see the concrete ranges of each definition, see [ranges.md](ranges.md).
@@ -64,13 +64,13 @@ See [implementers-tips.md](implementers-tips.md).
## Unicode data list
-
| Data name | Latest | Unicode 16|
+
| Data name | Latest | Unicode 17|
| --- | --- | --- |
-
| East Asian Width |https://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt|https://www.unicode.org/Public/16.0.0/ucd/EastAsianWidth.txt|
| Characters followed by U+FE0E/U+FE0F |https://unicode.org/Public/UCD/latest/ucd/emoji/emoji-variation-sequences.txt|https://unicode.org/Public/16.0.0/ucd/emoji/emoji-variation-sequences.txt|
| Characters followed by U+FE0E/U+FE0F |https://unicode.org/Public/UCD/latest/ucd/emoji/emoji-variation-sequences.txt|https://unicode.org/Public/17.0.0/ucd/emoji/emoji-variation-sequences.txt|