
Unicode handling in xterm.js #1709

Closed
@jerch


Coming from #1707, it seems that correct Unicode handling is becoming more and more of an issue for people because of emojis. Since we all love emojis, this should get fixed ASAP 😄

Proposal:
Create a provider for different Unicode versions that hides the version-specific data and implementations behind a nice API. Currently we only need version-dependent implementations for wcwidth, so a rough sketch could look like this:

interface IUnicodeProvider {
  supportedVersions(): string[];
  getVersion(): string;
  setVersion(version?: string): void;  // version optional for fallback behavior
  wcwidth(ucs: number): number;
  getStringCellWidth(s: string): number;
  // ... more to come with support of other Unicode features
}

Ideally the provider is self-contained, so the terminal only has to deal with the interface methods and update the version/locale when needed. The provider would handle the low-level details of supplying the correct data sets, so the methods just work as expected for a supported version.
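To make the idea more concrete, here is a minimal sketch of what such a self-contained provider might look like with statically bundled data. Everything in it is an assumption for illustration: the class name `StaticUnicodeProvider`, the version labels `'6'` and `'11'`, and above all the tiny range tables, which are placeholders and not real Unicode width data.

```typescript
// Hypothetical sketch, not existing xterm.js code.
interface IUnicodeProvider {
  supportedVersions(): string[];
  getVersion(): string;
  setVersion(version?: string): void;
  wcwidth(ucs: number): number;
  getStringCellWidth(s: string): number;
}

// Inclusive [start, end] code point ranges.
type CodePointRange = [number, number];

interface IVersionData {
  wide: CodePointRange[]; // code points occupying 2 cells
  zero: CodePointRange[]; // code points occupying 0 cells (combining marks)
}

// Placeholder tables: deliberately tiny, NOT real Unicode data.
const DATA: { [version: string]: IVersionData } = {
  '6': { wide: [[0x1100, 0x115F], [0x4E00, 0x9FFF]], zero: [[0x0300, 0x036F]] },
  '11': {
    wide: [[0x1100, 0x115F], [0x4E00, 0x9FFF], [0x1F600, 0x1F64F]],
    zero: [[0x0300, 0x036F]],
  },
};

const FALLBACK_VERSION = '6';

function inRanges(ucs: number, ranges: CodePointRange[]): boolean {
  for (let i = 0; i < ranges.length; ++i) {
    if (ucs >= ranges[i][0] && ucs <= ranges[i][1]) return true;
  }
  return false;
}

class StaticUnicodeProvider implements IUnicodeProvider {
  private _version = FALLBACK_VERSION;

  supportedVersions(): string[] {
    return Object.keys(DATA);
  }

  getVersion(): string {
    return this._version;
  }

  // A missing or unsupported version falls back to FALLBACK_VERSION.
  setVersion(version?: string): void {
    const known = version !== undefined
      && Object.prototype.hasOwnProperty.call(DATA, version);
    this._version = known ? (version as string) : FALLBACK_VERSION;
  }

  wcwidth(ucs: number): number {
    const data = DATA[this._version];
    if (inRanges(ucs, data.zero)) return 0;
    if (inRanges(ucs, data.wide)) return 2;
    return 1;
  }

  getStringCellWidth(s: string): number {
    let width = 0;
    for (let i = 0; i < s.length; ++i) {
      let code = s.charCodeAt(i);
      // Combine a UTF-16 surrogate pair into a single code point.
      if (code >= 0xD800 && code <= 0xDBFF && i + 1 < s.length) {
        const low = s.charCodeAt(i + 1);
        if (low >= 0xDC00 && low <= 0xDFFF) {
          code = (code - 0xD800) * 0x400 + (low - 0xDC00) + 0x10000;
          ++i;
        }
      }
      width += this.wcwidth(code);
    }
    return width;
  }
}
```

With this shape the terminal would only ever call `setVersion` and the width methods; swapping the tables for generated ones would not change any caller.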
Within the provider we can then decide whether the data is shipped statically in the code base or created on the fly. The first option will have quite an impact on xterm.js' size; the second will raise async questions (remember, most of the core parts are synchronous atm). The whole Unicode handling could also be bundled into some addon-like feature for version XY.
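One possible way to keep the core synchronous while still allowing data to live outside the main bundle is a small registry that an addon fills in whenever its data is ready. The sketch below is only an assumption about how that could be wired up: `UnicodeVersionRegistry`, `register`, `activate`, and the version labels are made-up names, and the width functions are toy stand-ins for real tables.

```typescript
// Hypothetical sketch, not existing xterm.js code.
// A width implementation for one Unicode version.
type WcwidthFn = (ucs: number) => number;

class UnicodeVersionRegistry {
  private _impls: { [version: string]: WcwidthFn } = {};
  private _active: string;

  // The core ships one small default so wcwidth always works synchronously.
  constructor(defaultVersion: string, defaultImpl: WcwidthFn) {
    this._impls[defaultVersion] = defaultImpl;
    this._active = defaultVersion;
  }

  // Called by an addon once its data is available (however it was obtained).
  register(version: string, impl: WcwidthFn): void {
    this._impls[version] = impl;
  }

  supportedVersions(): string[] {
    return Object.keys(this._impls);
  }

  getVersion(): string {
    return this._active;
  }

  // Switches versions only if the data is already registered;
  // returns false and keeps the current version otherwise.
  activate(version: string): boolean {
    if (!Object.prototype.hasOwnProperty.call(this._impls, version)) {
      return false;
    }
    this._active = version;
    return true;
  }

  // Always synchronous: it only consults data that is already in memory.
  wcwidth(ucs: number): number {
    return this._impls[this._active](ucs);
  }
}

// Toy width functions standing in for real version data.
const registry = new UnicodeVersionRegistry(
  'base',
  (ucs) => (ucs >= 0x1100 && ucs <= 0x115F ? 2 : 1)
);

// Later, e.g. from an addon:
registry.register(
  'emoji-aware',
  (ucs) => (ucs >= 0x1F600 && ucs <= 0x1F64F ? 2 : 1)
);
```

Any asynchronous loading would then stay inside the addon; the terminal itself only ever sees a version that is either fully available or not yet activatable.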

Up for discussion.
/cc @Tyriar, @bgw, @mofux, @dnfield
