CollationsOfLocale: Reconsider calling DefaultLocale #1053

@anba

CollationsOfLocale calls DefaultLocale because that mirrors ICU4C: ucol_getKeywordValuesForLocale("collation", ...) falls back to the default locale when the requested locale isn't supported. This was noted in:

For example in current implementations:

V8:

$ LANG=en ./d8 -e 'print(new Intl.Locale("abcdefgh").getCollations())'
emoji,eor
$ LANG=de ./d8 -e 'print(new Intl.Locale("abcdefgh").getCollations())'
emoji,eor,phonebk
$ LANG=sv ./d8 -e 'print(new Intl.Locale("abcdefgh").getCollations())'
emoji,eor,trad
$ LANG=zh ./d8 -e 'print(new Intl.Locale("abcdefgh").getCollations())'
emoji,eor,pinyin,stroke,zhuyin
$ LANG=ko ./d8 -e 'print(new Intl.Locale("abcdefgh").getCollations())'
emoji,eor,searchjl,unihan

JSC:

~$ LANG=en ./jsc -e 'print(new Intl.Locale("abcdefgh").getCollations())'
emoji,eor
~$ LANG=de ./jsc -e 'print(new Intl.Locale("abcdefgh").getCollations())'
phonebk,emoji,eor
~$ LANG=sv ./jsc -e 'print(new Intl.Locale("abcdefgh").getCollations())'
trad,emoji,eor
~$ LANG=zh ./jsc -e 'print(new Intl.Locale("abcdefgh").getCollations())'
pinyin,stroke,unihan,zhuyin,emoji,eor
~$ LANG=ko ./jsc -e 'print(new Intl.Locale("abcdefgh").getCollations())'
searchjl,unihan,emoji,eor

While it's possible to mirror the ICU4C behaviour when implementing this in ICU4X, maybe the spec should be changed to not call DefaultLocale?

It's easy for ICU4C-based implementations to detect when the fallback to the default locale was used (ucol_getKeywordValuesForLocale returns the status code U_USING_DEFAULT_WARNING), so no large code changes will be needed.

Proposed spec:

  1. If loc.[[Collation]] is not undefined, then
    1. Return CreateArrayFromList(« loc.[[Collation]] »).
  2. Let language be GetLocaleLanguage(loc.[[Locale]]).
  3. Let r be LookupMatchingLocaleByPrefix(%Intl.Collator%.[[AvailableLocales]], « loc.[[Locale]] »).
  4. If r is not undefined, then
    1. Let foundLocale be r.[[locale]].
    2. Let foundLocaleData be %Intl.Collator%.[[SortLocaleData]].[[<foundLocale>]].
    3. Let list be a copy of foundLocaleData.[[co]].
    4. Assert: list[0] is null.
    5. Remove the first element from list.
  5. Else,
    1. Let list be « "emoji", "eor" ».
  6. Let sorted be a copy of list, sorted according to lexicographic code unit order.
  7. Return CreateArrayFromList(sorted).
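
The proposed steps can be sketched in plain JavaScript to show the intended behaviour when no matching locale is found. This is only an illustration under stated assumptions: `sortLocaleData`, `lookupMatchingLocaleByPrefix` and `collationsOfLocale` are hypothetical stand-ins for the spec's internal slots and abstract operations, the locale data is made up for the example, and the prefix lookup is simplified (it doesn't handle Unicode extension sequences).

```javascript
// Hypothetical stand-in for %Intl.Collator%.[[SortLocaleData]].
// The entries are illustrative only, not real CLDR data.
// As in the spec, each [[co]] list starts with null.
const sortLocaleData = {
  de: { co: [null, "phonebk", "emoji", "eor"] },
  sv: { co: [null, "trad", "emoji", "eor"] },
};

// Simplified LookupMatchingLocaleByPrefix: drop trailing subtags until a
// prefix is found in availableLocales, otherwise return undefined.
function lookupMatchingLocaleByPrefix(availableLocales, requestedLocales) {
  for (const locale of requestedLocales) {
    let prefix = locale;
    while (prefix !== "") {
      if (availableLocales.includes(prefix)) return { locale: prefix };
      let pos = prefix.lastIndexOf("-");
      if (pos === -1) break;
      // Also drop a singleton subtag (e.g. "-x") together with what follows it.
      if (pos >= 2 && prefix[pos - 2] === "-") pos -= 2;
      prefix = prefix.slice(0, pos);
    }
  }
  return undefined;
}

// Sketch of the proposed CollationsOfLocale. `loc` is a plain object with
// `locale` and `collation` fields standing in for [[Locale]] and [[Collation]].
function collationsOfLocale(loc) {
  if (loc.collation !== undefined) return [loc.collation];
  const r = lookupMatchingLocaleByPrefix(Object.keys(sortLocaleData), [loc.locale]);
  let list;
  if (r !== undefined) {
    list = sortLocaleData[r.locale].co.slice();
    console.assert(list[0] === null);
    list.shift(); // remove the leading null
  } else {
    // No DefaultLocale() call: fall back to the collations every locale has.
    list = ["emoji", "eor"];
  }
  // The default sort compares strings by UTF-16 code units.
  return list.slice().sort();
}

console.log(collationsOfLocale({ locale: "abcdefgh" }).join(",")); // emoji,eor
console.log(collationsOfLocale({ locale: "de-AT" }).join(","));    // emoji,eor,phonebk
```

With this shape, the result for an unsupported locale such as "abcdefgh" no longer depends on the environment's default locale.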
