async parser handler support by jerch · Pull Request #3222 · xtermjs/xterm.js

jerch · 2021-01-20T19:17:10Z

Not being able to use async functions in parser handlers is an unfortunate limitation of the current parser implementation. This PR tries to come up with a solution to enable async support while keeping the parser performant.

TODO:

Shall fix #3218.

jerch · 2021-01-20T19:23:01Z

The early commit can be tested in the browser by marking SGR as async - simply replace:

xterm.js/src/common/InputHandler.ts

Line 2301 in 74ea558

public charAttributes(params: IParams): void {

with

  public async charAttributes(params: IParams): Promise<void> {

~~(Note that our local tests will not run anymore with that, as they heavily rely on writeSync.)~~ Fixed with 4a1bdad.

jerch · 2021-01-22T19:03:59Z

The last commit enables to test async handlers in our inputhandler transparently, which helps to spot regressions while messing around with random async handlers and testing the perf behavior.

The next thingy to reshape it is writeSync, after that the core tests should also work with async handlers.

jerch · 2021-01-22T20:21:47Z

@Tyriar The last commit deprecates writeSync as it wont work with async handlers anymore.

Reminder: this should be noted later on in release notes in case someone relies on it.

jerch · 2021-01-23T12:35:07Z

Done with the basic parser preparations. After some stack handling cleanup the async test case (ls -lR /usr/lib with SGR marked as async) takes only slightly longer (~1.4 times or 29 vs. 21 MB/s), guess this wont get any better.

Next steps:

Implement parser tests for CSI and ESC mixing all sort of sync and async handlers.
Move on to OSC and DCS async support. As written earlier, this is much more involved, as these handlers are fed by 3 methods from the parser via separate subparsers (start | put | end). Imho start can always be treated as sync, much in the sense of a constructor. put is more tricky - from my initial perspective to get proper flow control from async handlers in general, async support in put would be crucial here as huge amount of data may arrive. Not sure yet to go that path, a simpler impl would just treat the whole sequence as being blocked on the end call (which again needs a fixed mem limit during put to avoid OOM and some bailout logic for async workers).

jerch · 2021-02-25T18:19:13Z

@Tyriar @mofux
The PR is ready for a first review.

Tyriar

Mostly looks good, no concerns about stability/correctness considering the amount of testing coverage/changes.

Tyriar · 2021-02-26T21:24:01Z

+   * be used anymore. If you need blocking semantics on data input consider
+   * `write` with a callback instead.
+   *
+   * @deprecated Unreliable, will be removed soon.


FYI this will take a little while to adopt in vscode, it's used currently in the local echo and reconnect features. It's also used in a bunch of tests which I think should be harmless to stay.

Yeah, we also had many tests with writeSync, but those were easy replaced by a writeP implementation:

xterm.js/src/browser/TestUtils.test.ts

Lines 24 to 26 in 26bf1ab

public writeP(data: string | Uint8Array): Promise<void> {

return new Promise(r => this.write(data, r));

}

I used the same methodP idea for testing the methods along the async callstack:

xterm.js/src/common/InputHandler.test.ts

Lines 50 to 56 in 26bf1ab

public async parseP(data: string | Uint8Array): Promise<void> {

let result: Promise<boolean> | void;

let prev: boolean | undefined;

while (result = this.parse(data, prev)) {

prev = await result;

}

}

For the deeper calls it is a bit more cumbersome to get done right, thus my warnings all over the place not to call those methods directly anymore. Note that it always was dangerous to directly inject chunks at those lower levels while Terminal.write still holds data (will mess up parser states even in sync mode), but now with async it is even more dangerous as it might break the proper continuation. Lets hope that ppl dont dismantle the callstack and call the lower methods directly. Also I dont know of any TS way to prevent that, is there an access pattern like friend in C++?

Final note on writeSync - I did not remove it yet, because it will keep working as before as long as there are no async handlers hooked into the parser. It even keeps working with async handlers if pending data does not trigger any async action (thats what I meant with promise islands - you can still sail in sync water normally at high speed, but now there are promise shallow regions where the boat can crash if not used with writeP). Initially I thought about marking async handlers statically, so the parser would have a "map" of promise regions, but that turned out as not being helpful, as it cannot be decided prehand without actually parsing the data. Thus I went with the simpler handling of "whenever a promise is returned, it is an async handler" (effectively sailing on "runtime sight").

Tyriar · 2021-02-26T21:31:09Z

+   * Note: Never call this directly for a running terminal instance in production.
+   * Always use `Terminal.write`, which provides in-band blocking and correct exection order.
+   */
+  public parse(data: string | Uint8Array, promiseResult?: boolean): void | Promise<boolean> {


Does this work, and if so does it match the valid return types?

public parse(data: string | Uint8Array): void | Promise<boolean>; public parse(data: string | Uint8Array, promiseResult: true): Promise<boolean>; private parse(data: string | Uint8Array, promiseResult?: boolean): void | Promise<boolean> { }

Well I dont see much gain from variant 1. and 2. beside easier dealing with a terminal breakdown during tests. Main reason - the proper continuation is placed at WriteBuffer level in then, which InputHandler.parse is not aware of, thus the right execution order cannot be continued from InputHandler.parse directly anymore. I'd favour a more restrictive access pattern here (like friend in C++), but dont know how to do something like that in TS.

In theory during a normal terminal run with valid data in the write buffer the callstack of Terminal.write --> InputHandler.parse --> Parser.parse --> (Osc|DcsParser.end) --> async handler must not be "sidecalled", whenever an async handler got paused, as it has to be resumed from the very top. I dont like the manual stack unwinding done here, but it is the only way I found to keep sync code fast.

Maybe I didn't explain enough. I think the above would give a compile error if parse(data, false) was called, and it would return Promise<boolean> when parse(data, true) is called (instead of void | Promise<boolean>). Just a little thing that could potentially catch bugs by indicating what is valid.

Not sure if I fully understand your idea - do you want to conclude from provided arguments the return type, like narrowing it down? That is not possible, as the fact, whether a promise gets returned, depends on these two rules:

an async handler is registered for a sequence XY (and will not be skipped by an earlier handler)

AND data contains sequence XY

It never depends on the arguments of parse, all these variants can return void | Promise<boolean>:

parse(data: string | Uint8Array): void | Promise<boolean>; parse(data: string | Uint8Array, promiseResult: true): Promise<boolean>; parse(data: string | Uint8Array, promiseResult: false): Promise<boolean>; // which reduces to parse(data: string | Uint8Array, promiseResult?: boolean): void | Promise<boolean>;

In fact the promise return value and the result argument are responsible for the two-sided handling of the stalled callstack as pausing/resuming:

promise return value

ascending path of stalled callstack (on pause)

stack progress is saved at each call level individually

fast unwinding by direct return jumps

promise bubbles up to top level to enable micro- or macrotask exection

micro- or macrotask exection, eventually our promise gets fullfilled and we grab its return value

promiseReturn:

descending path of stalled callstack (on resume)

rewind stack progress from stack saves on each level

bubbles down to occurrence of promise creation, continues innermost resume with return value (indicating whether next handler should be probed as well)

back to sync processing

The only assumption that could be made is for two consecutive parse calls:

call returns a promise - then next call has to be called with the awaited result of that promise in promiseResult for proper continuation and execution order (we are between pause/resume)

parse calls finishs without returning a promise - we are in sync mode, nothing to wait for before we can process the next chunk

Technically the whole handling is a quite explicit way of coroutines as one could write them in C (where it can be expressed even more elegant with static vars 🙈). The different call requirements for sync vs. async is an unfortunate side effect of this explicit coroutine implementation. But as noted earlier, it was the fastest I was able to find (under the assumption of only few to none async handlers ever being registered). True that the full promisified call chain has much simpler call signatures, but it also runs 4 times slower.

... Just a little thing that could potentially catch bugs by indicating what is valid.

Since we can only derive the proper call signature at runtime by knowing the result of the last call, imho TS's static type system cannot help here much.

Sorry for the lengthy explanation, there is still a high chance I misses your idea here.

- reorder exception throwing to be sync to band position - document WriteBuffer._innerWrite - remove any declarations - remove conditional assigments - use faster return value branching in all parsers - remove dead code in benchmark

jerch · 2021-02-27T21:43:40Z

@Tyriar Thx for the review, guess I addressed most points. Also if you have an idea how to apply a more restrictive access pattern like friend in C++ I am all ears.

…default handlers chance to run

jerch · 2021-02-28T13:32:20Z

@Tyriar I changed the continuation of faulty handlers to false to give the default handler a chance to still get called afterwards. I think thats the better default behavior, as most ppl will hook on already handled sequences, thus the sequence should keep bubbling to our default handler, even if the custom async handler error'ed out.

Some open questions:

Do we need a second timeout with a hard rejection?
Currently we have a timeout warning after 5s, that fires only once. Imho this is enough for a careful dev to spot limitations of his handler implementation and to ensure himself, that the promise always gets resolved/rejected in a timely fashion. Not so for a sloppy dev - the code might have dangling promise endpoints due to improper branch testing and such. Here the terminal would block forever, if such an endpoint got triggered.
To avoid that hard blocking scenario in production environments later on we could place a second timeout on a pending promise and let it fail hard/reject after that second timeout. This would keep the terminal interactive/responsive, although the terminal state might be messed up, if the custom handler already altered something prior promise creation (or in the executor itself). But it kinda raises "realtime" questions about guaranteed execution times.

Better async stack introspection needed?
It is somewhat tricky to get better introspection information on the blocking handler, mainly because of the scattered stack saves and the way we register custom handlers through anonymous in-between functions. The scattered stack is not a big deal, it could be collected by a sync descent into the callstack, still the information is somewhat useless without proper marking of the handler function itself. If we want better introspection here, we would have to store more information along the registering path.

I dont see both aspects as showstoppers of the actual PR and suggest to implement them in later PRs if we encounter issues around blocking async handlers and need more failstate handling to overcome them.

Tyriar · 2021-03-15T23:06:49Z

Also if you have an idea how to apply a more restrictive access pattern like friend in C++ I am all ears.

I normally stick "friend classes" in the same file and don't export them. Closest I've come up with for TS.

Tyriar · 2021-03-15T23:11:25Z

Do we need a second timeout with a hard rejection?

The warning is good, recovering is probably a good idea too so the terminal doesn't just break because of some addon.

Better async stack introspection needed?

👍 to defer, not sure we need this but I haven't tried to implement an async handler.

Tyriar

Lgtm 🚀

Tyriar · 2021-03-15T23:19:55Z

+   * Note: Never call this directly for a running terminal instance in production.
+   * Always use `Terminal.write`, which provides in-band blocking and correct exection order.
+   */
+  public parse(data: string | Uint8Array, promiseResult?: boolean): void | Promise<boolean> {


Maybe I didn't explain enough. I think the above would give a compile error if parse(data, false) was called, and it would return Promise<boolean> when parse(data, true) is called (instead of void | Promise<boolean>). Just a little thing that could potentially catch bugs by indicating what is valid.

…_handlers

…ogLevel

jerch · 2021-03-17T21:56:17Z

@Tyriar I pressed the merge button 😅. Since you mentioned above the explicit usage of writeSync in vscode - plz drop me a note if you encounter issues with a promise based replacement there or need something faster than simple "everything is handled by promise turnover".

early version of async support for CSI and ESC handlers

bc10480

jerch self-assigned this Jan 20, 2021

jerch added 2 commits January 22, 2021 17:49

Merge branch 'master' into async_handlers

0ad9f70

fix tests for async handlers in inputhandler

4a1bdad

deprecate writeSync

adfeffe

jerch added 4 commits January 22, 2021 21:24

make linter happy

22d777c

fix timeout issue on macos

b07ca8b

simplify stack save in parser

2f07ed9

Merge branch 'master' into async_handlers

4ba7aef

jerch added 8 commits January 24, 2021 09:11

Merge branch 'master' into async_handlers

a9eb1de

async parser tests for CSI and ESC

68b7f88

Merge branch 'master' into async_handlers

898a18d

async impl for DCS, DcsParser tests

d59f4d2

OSC impl, OscParser tests

041e259

update parser hooks api and tests

d0b5556

update public API

5db479c

minor tweaks, API docs update

2650211

jerch mentioned this pull request Feb 20, 2021

Support 1337 escape sequence to show inline images #614

Closed

jerch added 6 commits February 25, 2021 15:41

async OSC/DCS registering/dispose tests on parser instance

bce3269

fix linter warning

9985ae2

inputhandler tests

d500b0f

async handler API tests

6b57921

make reset async aware

b4635ed

Merge branch 'master' into async_handlers

26bf1ab

jerch marked this pull request as ready for review February 25, 2021 18:18

Tyriar requested changes Feb 26, 2021

View reviewed changes

jerch added 2 commits February 27, 2021 20:18

multiple changes:

cc005a2

- reorder exception throwing to be sync to band position - document WriteBuffer._innerWrite - remove any declarations - remove conditional assigments - use faster return value branching in all parsers - remove dead code in benchmark

log a warning if an async handler takes too long

48a93c3

do not timeout on faulty async handlers, continue with false to give …

cdd238d

…default handlers chance to run

Merge branch 'master' into async_handlers

c19cc9b

Tyriar added this to the 4.12.0 milestone Mar 15, 2021

Tyriar approved these changes Mar 15, 2021

View reviewed changes

jerch added 4 commits March 17, 2021 22:08

Merge branch 'master' into async_handlers

a748845

Merge branch 'async_handlers' of github.com:jerch/xterm.js into async…

76fdf9d

…_handlers

interface for inputhandler parse stack, proper typing of LogService.l…

8e7cdd2

…ogLevel

interface for subparser stack saves

869c11f

jerch mentioned this pull request Mar 17, 2021

update parser hooks description regarding async handlers xtermjs/xtermjs.org#136

Closed

jerch merged commit 145b487 into xtermjs:master Mar 17, 2021

Tyriar mentioned this pull request Mar 22, 2021

Escape in terminal causes stack overflow microsoft/vscode#119472

Closed

	public writeP(data: string \| Uint8Array): Promise<void> {
	return new Promise(r => this.write(data, r));
	}

	public async parseP(data: string \| Uint8Array): Promise<void> {
	let result: Promise<boolean> \| void;
	let prev: boolean \| undefined;
	while (result = this.parse(data, prev)) {
	prev = await result;
	}
	}

Conversation

jerch commented Jan 20, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jerch commented Jan 20, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jerch commented Jan 22, 2021

Uh oh!

jerch commented Jan 22, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jerch commented Jan 23, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jerch commented Feb 25, 2021

Uh oh!

Tyriar left a comment

Choose a reason for hiding this comment

Uh oh!

Tyriar Feb 26, 2021

Choose a reason for hiding this comment

Uh oh!

jerch Feb 27, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Tyriar Feb 26, 2021

Choose a reason for hiding this comment

Uh oh!

jerch Feb 27, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Tyriar Mar 15, 2021

Choose a reason for hiding this comment

Uh oh!

jerch Mar 17, 2021

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jerch commented Feb 27, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jerch commented Feb 28, 2021

Uh oh!

Tyriar commented Mar 15, 2021

Uh oh!

Tyriar commented Mar 15, 2021

Uh oh!

Tyriar left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Tyriar Mar 15, 2021

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

jerch commented Mar 17, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

jerch commented Jan 20, 2021 •

edited

Loading

jerch commented Jan 20, 2021 •

edited

Loading

jerch commented Jan 22, 2021 •

edited

Loading

jerch commented Jan 23, 2021 •

edited

Loading

jerch Feb 27, 2021 •

edited

Loading

jerch Feb 27, 2021 •

edited

Loading

jerch commented Feb 27, 2021 •

edited

Loading

jerch commented Mar 17, 2021 •

edited

Loading