Skip to content

Latest commit

 

History

History
300 lines (158 loc) · 33.4 KB

File metadata and controls

300 lines (158 loc) · 33.4 KB

2024-08-22 ECMA-402 Meeting

Logistics

Attendees

  • Shane Carr - Google i18n (SFC), Co-Moderator
  • Ben Allen - Igalia (BAN)
  • Elango Cheran - Google i18n (ECH)
  • Henri Sivonen - Mozilla (HJS)
  • Eemeli Aro - Mozilla (EAO)
  • Frank Yung-Fong Tang - Google i18n, V8 (FYT)
  • Richard Gibson - OpenJS Foundation (RGN)
  • Yusuke Suzuki - Apple (YSZ)

Standing items

Status Updates

Updates from the Editors

BAN: I've been trolling through the backlog. I posted some comments about whether we want to revive older issues. There are a few highlights from merges in the past month. #907 starts using literal Unicode instead of HTML entities more consistently. Also other PRs fixing older issues.

Updates from the MessageFormat Working Group

EAO: We've been taking feedback from the technology preview. Effectively finalizing the Intl.MessageFormat [ ?]. This is something that some input from 402 would be useful for.

The week starting 9 September we have a MFWG work week. The aim is to resolve open issues.

See if we can effectively finalize MessageFormatv2 this fall. The reason I’m asking here is that I am not personally sure whether we should do that, especially since we’ve gotten relatively little feedback from implementers or localizers who weren’t previously engaged. We have discussion with TG5 for getting user study done through them on MFv2. Obviously if that arrives after we’ve set the spec to a specific shape, there’s little we can do at that point. We’ve discussed, but not asked for an early w3c TAG review, which could be informative in determining whether what we have is good/complete enough to finalize. What I’m asking for is whether any of you have feelings/thoughts on this, and would like to advertise the week of 9 sept in particular for participation in final discussions, since we’ll be deciding that week on whether to end that preview. There’s stuff we’re working on at the moment, particular interest in the specifics on how exactly the functions we have in the spec built in (number, integer, datetime, date, time, string) work, and how exactly are the options of these defined currently. Currently required set is a subset of what we have in 402 formatters, with one minor deviation in the number function – it’s being used for both number formatting and selection, meaning that in Intl.PluralRules we have type: cardinal and type: ordinal, but MFv2 has select as the name of the options, with the values being “exact”, “plural”, and “ordinal.” Currently discussing set of options / functions that the spec will include not as required but as effectively recommended. Seeing if we should have entire set of what’s in 40, or some subset thereof. If interest in not including some options that 402 has in the supported or recommended or required options for MFv2, input is useful. Example: we have hour12 and hourCycle both as options for datetime formatting, and I think they mostly do the same thing. Input on this would be welcome.

SFC: A lot of stuff in that update! May be useful to identify individual open questions. Sounds like you’re looking for additional final feedback, specific issues like one raised at end about hour cycle, and about how the built-in functions interact with Intl, I definitely have thoughts about this, CLDR design working group has been talking a lot about hourCycle recently. One thing i specifically want to talk about is that I had raised the question of semantic skeleton datetimeformat data, I have a draft of spec that I’ve shared, don’t know if you’ve seen, but this would be something that would be useful to look into. Now there’s a specification for semantic skeleton, so that’s not a excuse not to use it. Also discussion of wrapping types together with values (?) to pass into function. We can still explore. Is there action we should be taking at this end?

EAO: First of all, none of the things I mentioned are Intl.MessageFormat issues. They are MessageFormat 2.0 issues. That is underlying the Intl.MF impl, but the important thing on which feedback would be desirable is, how would TG2 want to communicate the readiness of the specification? Everything else follows from there in levels of importance. For semantic skeleta, I would not want the MF spec to require the implementation of anything that we don’t have an already-existing underlying implementation that’s already in stage 4 for JS, since if we require something this will no longer be for ICU implementations or ECMA-402 implementations, but would be for all implementations. The bar for getting semantic skeletons specified even as a recommended thing would require spec for semantic skeleta to be itself complete. THat’s under CLDR as well, or elsewhere?

SFC: Comment reflects misunderstanding about semantic skeleta. Only an API for accessing options for DTF, they don’t tread into how to process those recommendations. What we’ve landed on is a semantic skeleton is a field set and some stylistic options. Specific list of what field sets allowed (year month, month day, not year and day of month, for example). We give specific list of allowed field sets, and then options to configure those (lengths, era display), you can use different ways to do hour cycle, one of the open questions on messageformat. Semantic skeleta specify what the options are, once specified, you have a recommended way to map patterns into formatting, ICU4X implements that. It’ll take 20 LoC to map from classical skeleton to semantic skeleton, in terms of using implementation that’s there. Doesn’t require anyone to implemented, can be implemented in DF. If ICU4X implements, that could be directly used, but it’s not a prerequisite for using semantic skeleta. All the other things there for date formatting, like date skeletons – this isn’t an ECMAScript issue, but is MF related – ICU4X doesn’t currently have classic skeleta, have removed it, if people complain that semantic skeleta doesn’t do everything we want, we could look into adding a classical skeleta API, but I don’t want MFv2 to require us to support old-style skeleta now that we have semantic skeleta available.

SFC: Also thoughts on hour cycle. We’ve been discussing hour cycles in the CLDR design working group almost weekly.

EAO: It would be useful for you to raise an issue about semantic skeleta in the MF working group repo, so we can consider it as a whole. Currently spec doesn’t have anything about any skeleta, partly because that is not a subset of what we support in JS. So it is entirely possible for us to introduce semantic skeleta recommended option / shape for it if the spec is well-defined, but we should not spec it as required until/unless it is supported by us. I would want the discussion of whether to support datetime semantic skeleta to be done in a datetime formatting context rather than a message formatting context, so that we don’t have a weird situation where semantic skeleta can be used, but only in MessageFormat rather than DateTimeFormat. Should semantic skeleta be the API in DateTimeFormat and MessageFormat? Starting this discussion, though, requires the issue being raised in the MF repo, also against Intl.DateTimeFormat.

EAO: Main takeaways:

  • Work week September 9, Consider whether to finalize MFv2 spec this fall or leave to later.
  • Input from 402 highly desired, especially as TG5 user research study as yet incomplete.
  • Would be appropriate for others in this group to participate in TG5 issue, since TG2 is audience of results of study

SFC: I would like if the TG5 study has activity – ideally is complete – beforehand. I see the TG5 study as prerequisite for Intl.MessageFormat, but also v2, we don’t want to have a situation where we get info post-stabilization.

EAO: If that is the common sense of this group, it should be communicated clearly, not just for me. Earlier it can be accounted for, earlier it can shape our work.

SFC: We could make a formal recommendation from this group – we have enough of a quorum. I would favor that. Maybe we should wait for FYT to get back on the call to make a formal recommendation.

Updates from the W3C i18n Group

BAN: I've been attending the meetings but it hasn't been of direct relevance to us. We could perhaps add a "ships entire payload" agenda item to next week's meeting.

HJS: Is the W3C group generally reviewing ECMA-402 API designs?

BAN: TG1 wants us to not expose information via Intl not beyond what is exposed by the user agent string. We've been talking about how we could potentially add additional locales on demand.

HJS: Does the request take timing side-channels into account?

SFC: Yes, it does, but we need to get it validated by a security team.

HJS: Is this about the Cantonese issue? I don't think we should design anything with the assumption that we couldn't solve the Cantonese issue by just merging the dictionaries somehow.

BAN: SRL has comment about Thai script breaking issues.

#780 (comment)

SFC: I agree with HJS, I believe. This is not necessarily the Cantonese issue, but more about whether users are able to load in something like a subsaharan africa language pack, or a native american language pack, languages that wouldn’t be available by default but that you could install dynamically. In terms of next steps, I do think we should raise this with W3C. One thing we’ve said for a long time is that there’s more privacy experts there than in Intl, and therefore we should be engaging more with W3C. Need to engage with privacy professionals more generally – take a more active role in this. Also might be good to mention at TPAC in Anaheim.

Updates from Implementers

https://github.com/tc39/ecma402/wiki/Proposal-and-PR-Progress-Tracking

FYT: We are shipping DurationFormat! Not yet released, branch was cut, should hit general user around late September.

SFC: One question about V8 asked in Temporal meeting that I didn’t have answer for: any update about monthcode / errorcode? Any updates about Temporal implementations from V8 side?

FYT: Not recently.

SFC: We should probably move on errorCode/monthCode. Also, Temporal champions made announcement in last TG1 saying they consider Temporal ready to ship and will stop making changes now, so that implementations can start implementing it. Need to solidify errorCode/monthCode, but need to also talk about implementation choices. It would be nice if we could start moving the needle again on the Temporal implementation, Temporal champions anxious about that.

FYT: Sure. We’ll see if the spec is stable, or if they just say it’s stable. The reality is that the spec keeps changing, though. We’ll see.

YSZ: Did you see about idea how they attempt to change the spec further? I think Temporal was having significant amount of changes despite being in Stage 3, and we cannot implement that type of feature.

SFC: Temporal champions are aware of this, in TG1 a few weeks ago they announced that spec was stable / ready to be implemented. I’m on the working group, there’s been two meetings since then, no additional changes discussed. In terms of bigger changes, and even smaller editorial changes, they aren’t planned. If there are changes, they’ll be along the lines of “FYT found a bug in the spec.” At this point Anba, who is our spec-bug-finding czar, has done a lot of legwork on implementing it in Firefox, it’s already gone through multiple passes of people looking for bugs in the spec, chance that big new bugs found is small.

YSZ: Thank you so much!

Proposals and Discussion Topics

Recommendation about TG5 study and MessageFormat 2.0

tc39/tg5#3

SFC: We have a TG5 study that’s open, but hasn’t moved. The TG5 study impacts Intl.MessageFormat – the point is to look at what shape of a MessageFormat specification will be most useful for users on the web platform, that’s the nature of the study. This study blocks the Intl.MessageFormat API. The question does it also block MessageFormat 2.0. Let’s say it comes back whenever it comes back, but it says “we mostly like the MessageFormat API, but there’s also a thing that impacts MF 2.0”. I don’t want to be in a situation where our hands are tied, and we must say “it’s already frozen, we can’t change it based on your feedback.’ I would rather be in a situation where the study comes back and we can take the input from the study and bring it into MessageFormat 2.0 on Unicode. The implication is that we don’t want to take MessageFormat out of tech preview until the TG5 study is done. TG5 study is in critical path. Do we want to make it a formal recommendation from TG2? If so, EAO and ECH can bring back, perhaps get more momentum on getting the TG5 study under way

EAO: I would support what SFC is recommending, noting that my understanding is that the scope of the TG5 study is intended as the full API, especially about including the string syntax. If we wanted to remove that study from the critical path for finalizing MF2, then we would need to explicitly ask for that study to exclude the syntax. And I wouldn't want that.

ECH: I concur; the linkages and dependencies that have been stated that would put this on the critical path all makes sense – the reasoning all makes sense. There is an issue that Addison filed in the MessageFormat repo to ask W3C for feedback. I haven’t created an issue, but brought up a similar question / affirmation and got a positive response that we should collect feedback from the ICU TC and other Unicode WGs/TCs. Having a proper study would be the right way to do it.

SFC: We had previously made a recommendation – I don’t know if it’s a formal recommendation – that the understanding of this group is that MessageFormat should include a syntax, and is not motivated without a syntax. This study very much goes along with that. Having this study in hand is a way to push to have the syntax included in the Intl.MessageFormat proposal. I do think the study should remain in the critical path, with the syntax in scope for TG2’s purposes, which means it should remain in scope for Unicode’s purposes. Let’s write a recommendation.

ECH: I was also considering whether the scope can be expanded, which isn’t something my management chain could support. I don’t know exactly what the status is of the study effort.

EAO: SFC, if you could file an issue with this position in the MF repo, it will make it easier for us to make sure that it gets noted there and discussed.

SFC: Two action items: TG5 study, and semantic skeleta.

SFC: Shall we iterate on text of recommendation before we approve it as a group?

EAO: Maybe say “graduating from tech preview”?

ECH: This is more precise, and more precisely matches what we discussed.

Recommendation

  • TG39-TG2 requires that a TG5 study be complete before moving forward on the Intl.MessageFormat proposal. The study's scope should determine what shape of Intl.MessageFormat API and syntax delivers the most value to Web developers.
  • Because Intl.MessageFormat depends on Unicode MessageFormat 2.0, TC39-TG2 recommends that the TG5 study be completed and feedback from the study integrated into Unicode MessageFormat 2.0 before that specification graduates from tech preview.

LGTM: SFC, ECH, YSZ, EAO, FYT

DurationFormat merge issue

tc39/proposal-intl-duration-format#192

FYT: Issue 192 already got approved, and I made PR 198. I couldn't merge PR 198 due to a merge conflict, so BAN made a new commit, but it got merged directly to trunk. I think we should get approval on it.

BAN: Oh, sorry, I looked and I couldn't find where in TG1 this got approved. I can roll back the change.

SFC: I think we should keep the change in there and just ask for post-merge approval.

FYT: We also need to write a test.

SFC: With regard to issue 192 in DurationFormat, there was Google summer of code participant who was looking at that. I saw that these had landed on trunk with no PR landing. I didn’t see it land either, that’s why it’s good to have PRs in other words.

Action Item

BAN to add this to the TG1 agenda.

Fix bug with RelativeTimeFormat not correctly passing “numberingSystem” option

#919

SFC: This is a web-compatible change, but not an editorial change – it is a normative change.

FYT: Yes, if there’s a polyfill that implements this thing, their behaviour could change.

SFC: Do we give this our TG2 approval? If FYT, YSZ not ready to give approval, you can comment on the issue?

YSZ: I'm pretty much fine with the change, yes. I will also comment on the issue.

Explicitly designate sort order for plural categories

#918

BAN: [presents issue]

SFC: I believe we should bring this to TG1 for consensus in Tokyo

YSZ: Looks good to me

RGN: I like it

FYT: What is the web reality?

BAN: No implementation currently uses this order. The order is not currently in the spec. I think everyone is using the same order, but that order is not comprehensible.

FYT: So there is a web compat risk, but the risk might not be significant. Is the behavior testable?

BAN: Yes, by testing the order of the return value order.

SFC: Just pick a locale and test the order of the return value.

FYT: I'm fine with this, then.

Inline GetOperands into PluralRuleSelect

#635

BAN: [presents issue]

RGN: GetOperands returns a Operands Record with information extracted from the input string, extractions as specified by UTS 35. GetOperands in ResolvePlural gets back the Record, calls PluralRulesSelect with that Record, also with the input number. PluralRuleSelect has access to the full input; if they want to use something like GetOperands, they can. But given the nature of the object, they already can do arbitrary things. If we want to restrict that behaviour to only use data from GetOperands, we need to remove the n parameter from it.

SFC: And GetOperands itself contains the number?

RGN: It contains the absolute value of a number that’s the result of parsing the input string. Not guaranteed to match.

SFC: In ICU4X we found that we don’t actually need that n parameter, since it can be reconstructed from the other items. Nothing from CLDR uses the n operand, ICU4X drops the n operand entirely.

RGN: To clarify: if you look at ResolvePlural, it takes an input number, formats it using configuration associated with the PluralRules instance, takes that formatted number and breaks it apart with GetOperand, then it passes into PluralRuleSelect the result of GetOperands, but also the number before any formatting was applied. This is going to require some exploration of how implementations currently behave, but it’s not clear to me how we want them to behave either. If we were to strip away the n parameter from PluralRuleSelect, this would mean that they’re required to pick a plural rule based on the post-formatted number. This could have stripped away fractional digits entirely, for instance. If I say “no fraction digits allowed in my configuration” and I pass in 1.1 vs 1.0, the formatting is going to give the same string for both of those. The PluralRuleSelection would no longer be aware that the original input number has a fractional part.

SFC: The ILND strings come in PartitionNumberFormatPattern – that’s fine, I got distracted by that briefly. I think it should depend on [[FormattedString]] – if we change it to depend on [[FormattedString]], that’s fine. It should not depend on n at all.

RGN: The scenario it matters is 1.01 vs the number 1. Passing into PluralRuleSelect with configuration that strips away fractional parts for formatting, are they required to match?

SFC: The implementations always operate on the formatted string – never on the non-formatted number.

RGN: If that is in fact the case, it would make sense to not inline this, but instead remove the n parameter, and probably remove the n field from the Get Operands Result record. To make it clear that number gets formatted, result of formatting gets deconstructed, value from deconstruction is sent to PluralRuleSelect.

SFC: This flow aligns with my conception of how it could/should work. Open to considering whether deconstructing step should be stepped, but should definitely not be the original number.

RGN: Then we could definitely strip the deconstruction – mechanical process operating on the string, could be hidden in implementation

[missed]

SFC: If we allow mathematical values / BigInts, this could cause loss of precision, in which case we actually have different scope. We would have different – it would be a different beast. But GetOperands is a lot of steps that I don’t want a polyfill to have to repeat. I am tentatively in favor of taking FormattedString the only argument. BAN to make PR to make FormattedString the only argument, make that be the PR to ask for consensus on, once that’s up Zibi and others can weigh in.

RGN: The shape of the PR would be: remove GetOperand operation, remove step 6 in ResolvePlural, update step 7 with one argument ([[FormattedString]]). This matches a simple exploration of what I just did: the PluralRules selection is applying on the formatted string as far as I can tell, so that is behaviour conformant.

Smart Units Update

https://github.com/tc39/proposal-smart-unit-preferences

BAN: [gives presentation]

EAO: I might be interested in co-championing a units conversion proposal for 262

BAN: I might be interested in that.

EAO: I think a units conversion proposal should land in parallel with the Intl API because otherwise we will get abuse.

SFC: I don't share as strong of a position as EAO but I agree that a units conversion proposal would address the abuse concerns. I think such a proposal should be very, very limited in scope. For example, it should handle conversions in floating point space.

EAO: person-height is one of the units we want, and with [?] you can’t represent feet-and-inches

SFC: We need to think a lot about how we do mixed-unit representation. I think formatting of mixed units is something I had put on the agenda for today, since mixed-unit formatting is a piece that needs to be pulled out of the smart units proposal. It’s a small piece to start with, that could work nicely with the units representation proposal. We could have a type called “measure” which is where the unit conversions could live, encoded in floating-point numbers, supporting mixed units, used in messageformat, used for conversion, that could be where we could start – adding such a type. This would be the type we’d be able to support the formatting of, and this could address EAO’s concern. The scope could be whatever the type supports. Also resolves the question of how to pass a mixed unit. ICU currently formats in the largest unit – so for 5 feet 11 inches, you’d pass in whatever that was in just feet. Having a mixed-unit class that represents a number with a unit is probably a good approach. Could consider doing decimal, but if Decimal proposal doesn’t go forward, use Number. Not mathematical value – want to make sure we constrain the scope.

EAO: Something like a Measure class that could also do conversions sounds like it could be a decent solution to this. As mentioned before, I'd be happy to co-champion such a proposal.

SFC: One of the purposes of this was figuring out the direction to take this proposal. I’ve been pushing for a time the idea that we should keep it as a black box, but given the concerns raised by EAO about abuse, and the concern about MF needing an upgraded type, and given the issue of needing a good way to pass mixed units into a formatting API in the first place, I’m coming around to having a type that wraps together a value with the associated units might be a good way to go forward, and that’s the proposal we should explore first, supporting regular units and mixed ones, and what we could use for MessageFormat as well. That’s a full-sized proposal, if BAN and EAO worked together on championing this proposal, that could be the main deliverable. Put Smart Units on the shelf for a year with the understanding that it’s built on this other smaller proposal. I don’t think TC39 should pass proposals that aren’t motivated in and of themselves – proposals need to carry their own merit – but this one does carry its own merit regardless of whether we move forward with smart units.

BAN: There's also the en-US problem. Many people use en-US localizations for their browsers, but en-US is also one of the worst locale for unit localizations. For example, I read that 60% of users from Indonesia use en-US localizations.

HJS: I have an action item to write something up in this area, based on a discussion I had previously with SFC. My status is that I haven't forgotten about this and I hope to have a document to share at some point.

SFC: I think it is productive to focus the user preferences discussion on what solution can narrowly address the smart units proposal.

BAN: This discussion has been tremendously helpful; thank you.

Intl.DateTimeFormat does not support 'und' locale #885

#885

SFC: Intl.DateTimeFormat and most Intl formatters don’t support "und" locale. When you pass that, you get ‘en-US’ results, or whatever your browser localization is. This is sad. It’s a problem and we should do something about it, but before we do something about it we should discuss what we want to do about it.

EAO: So it’s not just DateTimeFormat or Segmenter that doesn’t support the "und" locale. Could be solved by something like the null locale proposal as part of the other proposal. But what would a "und" locale formatted datetime look like?

SFC: A couple of ways: 1) say that "und" is the null locale, since we don’t have behaviour for it. Since no browser does anything useful with it, we could just define "und" as the null locale, and define behaviour for it. Maybe ISO 8601 as the formatting for it. 2) Alternately, use the data in CLDR, not ISO 8601, weird format no one understands, but it is stable and we could also do that.

RGN: Similar comments to what EAO just raised. Also, IIRC, "und" isn’t defined where it needs to be. I think when we looked at this previously, there’s some registry it’s not a part of. It exists as this way-downstream concept, rather than being defined in the right place, and I wouldn’t want to squat on someone else’s registry. If we were to pursue this – which I think makes sense – I would require that it is defined in the right place so that we could have future forward-compatibility guarantees.

HJS: It seems to me that we need a kind of answer that results in ISO-8601 datetime formatting, and then some hand-waiving for long versions that normally, when there is a language, result in words like "Monday". If there is something more unusual in CLDR under "und", then we probably should not be exposing that to the web. On the other hand, for other stuff in CLDR when the root of something does make sense for global fallback, like collation, then it seems fine to use the CLDR definition of "und". In other words, if "und" is strange for a component like datetime, we shouldn't be exposing it, and we should use spec gymnastics to get a less strange outcome.

EAO: I have to briefly refresh myself on stable formatting. But there indeed the specific reason that stable formatting is proposing a null locale where null is an alias for ZXX (the code for “no linguistic content”) is that no one defines anything for ZXX. Defining this explicitly in JavaScript for 402 is that it would not create a new understanding for what date time formatting is. I agree that we shouldn’t introduce a new weird thing, and given that CLDR (and through that many systems) of "und" date time formatting following a specific shape, it would be surprising for some users for JS "und" date time formatting to be something different. My proposal for solving this would be to define the behaviour of a null/ZXX locale saying explicitly that it uses ISO 8601 for its date-time formatting.

HJS: Looking at the https://www.ietf.org/rfc/bcp/bcp47.html#page-1-56 link from chat pointing to BCP 47, it sounds like it depends on a for-API basis for 402. “No linguistic content” is a better semantic split – there’s already split between APIs in 402 that ingest natural language and that produce natural language – maybe that’s the split, maybe it’s confusing if we have different no-locale identifiers for different classes of API, so that we don’t use the wrong semantics given the definitions in BCP 47. There’s a possibility that the right answer here is that the APIs that consume natural language vs the APIs that produce natural language end up with the different identifier for the “I opt out of stating a language” thing.

SFC: It sounds like we don’t want "und" to produce the garbage that’s currently in CLDR, but we may or may not want "und" to map to the data that we get from the BCP 47 locale that we might use (ZXX). We do need to think about "und" doing something – right now it does what I think is the worst thing, which is loading the user’s locale. Unless we think that’s okay – "und" means undefined language, so we fall back to the user’s locale? I don’t think that’s an explicit decision we’ve ever made, though. So "und" can do one of three things: 1) CLDR root formatting 2) ISO 8601 formatting, or 3) "und" means user locale for purposes of formatting. I agree with HJS that "und" might make sense for Collator but not for DateTimeFormat

EAO: If we were advancing a null locale, suggest in certain places that "und" is another alias for ZXX.

HJS: The current behaviour that 402 does not know about "und" and so we fall back to the browser locale is pretty bad, and easy to misunderstand, especially when using APIs that take natural-language input, and the browser you yourself as a developer, if that happens to be one that doesn’t have any tailorings, it looks to you that "und" means the thing you think it means, but it doesn’t actually mean that. So I think option 3 is the worst of the options.

SFC: I both agree but also realize that there’s not a locale that means “use the system default locale”, other than the keyword “undefined”. Given that the string has that behaviour on the web platform, we could decide that "und" means that string. But I agree that that’s horrible behaviour.

RGN: Is it horrible behaviour? It is the behaviour that you get when the locale is undefined. I don’t see that it’s bad to treat an explicit "und" as the same as an explicit “undefined”.

HJS: My thinking is maybe too focuse on collation being the 402 API that I've implemented, and "und" making sense as the identifier for the root collation. I made this mistake myself thinking that "und" means the root locale while testing with an en-US browser. But it is true that "und" meaning UI locale is the current behavior. It's still pretty bad that we don't have an explicit way of invoking the root.

RGN: Based on my incomplete knowledge of collation, wouldn't it make sense to use "root" as the value for getting the root collation? With the understanding that it is rejected as invalid.

HJS: Would it be okay to mint a string outside of BCP47?

RGN: "root" is already specified in UTS 35.

FYT: I’m a little bit – the “und” is defined as ISO as “undetermined” language code. If it’s defined as undetermined, how can we define behaviour for it? It’s logically illogical.

SFC: Looking at the IETF definition of what’s undetermined… page 56.

SFC: Given that the definition of “und” is the language is undefined, it’s totally fair for browsers to say that they’ll use the system locale, since that’s the best we can infer. It is also coincidental that "und" is the first 3 letters of the keyword "undefined", which has that same behavior. I think that we could potentially say that “null or ZXX” is this other locale that has the behaviour that HJS proposes – collation with root behaviour – and maybe that’s the best path forward.

RGN: I’m glad you brought that up – I was refreshing myself on the Unicode treatment, and that’s true. UTS 35, and everything downstream from it, including CLDR, is treating “und” as a means for addressing the root locale, which is not what it means in BCP-47. Given the flexibility in BCP-47, it can be applied, but I don’t think it’s a good first choice, as opposed to ZXX which hasn’t been homesteaded in this way, and that explicitly means “non-linguistic” and so open for definition as something stable. I understand having read that why “und” would be a choice for someone who is well-versed in ICU and the fault is not ours, it’s UTS 35.

SFC: No conclusion reached, but this has been clarifying. We can address this at the same time as the Stable Formatting proposal.

EAO: As something of a last word, it sounds like in a near-future TG2 call we should see if we can find consensus on a Stage 1 update on a Stable Formatting proposal that proposes creating a null or ZXX locale, for instance to solve this problem.

SFC: Agreed. This might be a good topic for the face-to-face.