add blog posts about js backend debugging, stacks and weak refs #37

luite · 2023-02-23T08:29:27Z

Getting this off my plate: blog posts about JS backend debugging, stacks and weak references. Reviews/comments welcome.

(There's one more about compacting/sinking, but I still need to write a conclusion and some examples for that one.)

doyougnu

Most of the changes are in the debugging post. Other than that just some minor comments on explaining examples and structure of the writing.

doyougnu · 2023-03-02T14:25:07Z

blog/2023-02-28-debugging-the-js-backend.md

+
+This blog post is an experience report that presents a couple of practical techniques for debugging various problems in the JavaScript code.
+
+## Tracing Operations


In the introduction you state "... presents a couple of practical techniques ...". Perhaps each technique should be labelled as such in its section header:

Suggested change

## Tracing Operations

## Technique 1: Tracing Operations

doyougnu · 2023-03-02T14:28:28Z

blog/2023-02-28-debugging-the-js-backend.md

+
+## Tracing Operations
+
+Various components of the RTS have tracing options enabled by preprocessor definitions. For example weak reference operations can be traced by compiling the `rts` package with the `-DGHCJS_TRACE_WEAK` cpp option.


example? Maybe show some of the CPP'd code where the debug option lives

doyougnu · 2023-03-02T14:31:13Z

blog/2023-02-28-debugging-the-js-backend.md

+
+Various components of the RTS have tracing options enabled by preprocessor definitions. For example weak reference operations can be traced by compiling the `rts` package with the `-DGHCJS_TRACE_WEAK` cpp option.
+
+Currently, enabling the trace functionality requires rebuilding the `rts` package, while previously with GHCJS it was possible to enable the required tracing by just recompiling the final program. We will likely change this setup to include all tracing functionality in a debug rts liked when using the `-debug` flag, and easily modifyable global settings to enable or disable specific tracing modules.


This paragraph should be the last one in the section because it is no longer talking about the technique. Rather, it talks about the usability of the technique and then concludes that making the technique more usable is on our roadmap.

So the flow of the technique sections should be:

1. What the technique is
2. What information it provides
3. How to use the technique
4. Future plans or issues with what we currently ship, i.e., the part where we say "right now this is hard because you have to rebuild with blah blah, but in the future we'll expose a flag"

doyougnu · 2023-03-02T14:33:17Z

blog/2023-02-28-debugging-the-js-backend.md

+
+Currently, enabling the trace functionality requires rebuilding the `rts` package, while previously with GHCJS it was possible to enable the required tracing by just recompiling the final program. We will likely change this setup to include all tracing functionality in a debug rts liked when using the `-debug` flag, and easily modifyable global settings to enable or disable specific tracing modules.
+
+All the tracing uses the `h$log` function which can be easily modified to redirect the output of the trace, for example tracing only to an array (which can be watched by the JavaScript debugger) and keeping only the last `n` entries.


This is the key part. As a reader looking to debug my JS backend code, this is the part I'm most interested in. Thus you should add the examples that you allude to. That is, add an example that demonstrates how `h$log` can be "easily modified to redirect the output of the trace". This would add a lot of value for the audience.
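
For illustration, a minimal sketch of what such an example could look like: a replacement `h$log` that records into an array and keeps only the last `n` entries. The names `h$logBuffer` and `h$logBufferSize` are made up for this sketch and are not part of the RTS, and the real `h$log` may differ in how it formats its arguments.

```javascript
// Hypothetical sketch: h$logBuffer and h$logBufferSize are illustrative
// names, not RTS definitions.
var h$logBufferSize = 4;   // keep only the last n entries
var h$logBuffer = [];      // watch this array in the JavaScript debugger

// Replacement logging function: record the message instead of printing it.
function h$log() {
  // Join all arguments into one line, roughly like console.log would.
  var line = Array.prototype.slice.call(arguments).map(String).join(" ");
  h$logBuffer.push(line);
  // Drop the oldest entry once the buffer is full.
  if (h$logBuffer.length > h$logBufferSize) {
    h$logBuffer.shift();
  }
}

// Usage: six trace calls, only the last four are kept.
for (var i = 1; i <= 6; i++) {
  h$log("weak ref op", i);
}
console.log(h$logBuffer);
```

Keeping the buffer in a plain global makes it easy to add as a watch expression, and bounding its length avoids the debugger slowing down on very long traces.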

doyougnu · 2023-03-02T14:36:30Z

blog/2023-02-28-debugging-the-js-backend.md

+    }
+```
+
+The main loop keeps calling the funtion returned by the previous call, until the thread has to stop for some reason. This means that the JavaScript call stack isn't very useful for figuring out where something goes wrong in our code: It only contains function calls up to the main loop. If some `c` fails, we don't know much about what calls lead up to the error condition!


> This means that the JavaScript call stack isn't very useful for figuring out where something goes wrong in our code

I would start with this because it is the place the audience is at and a thing the audience will probably assume. So something like this:

1. Unfortunately the call stack is not that useful...
2. The reason is that the main loop of the RTS is ...<the example with all the c()...
3. Explain the example: the main loop keeps calling until...
4. Tie the example back into (1): If some `c` fails, we don't know much about what calls lead up to the error condition!

doyougnu · 2023-03-02T15:25:09Z

blog/2023-02-28-js-backend-stacks.md

+```javascript
+// suspending a thread t
+t.stack = h$stack;
+t.sp = h$sp;
+h$currentThread = null;


explain this example

doyougnu · 2023-03-02T15:25:52Z

blog/2023-02-28-js-backend-stacks.md

+h$currentThread = null;
+```
+
+## Stack Frames


aha! Perhaps link or reference this earlier at the place I made a comment about stack frames

doyougnu · 2023-03-02T15:26:54Z

blog/2023-02-28-js-backend-stacks.md

+
+Each stack frame starts with a header, which is a JavaScript function. The header is followed by zero or more slots of payload, which can be arbitrary JavaScript values.
+
+The header serves as the "return point": When some code is done reducing some value to weak-head normal form it returns this value to the next stack frame by storing it in `h$r1` (or more for large values or unboxed tuples), popping its own stack frame and calling the header of the next stack frame at `h$stack[h$sp]`


explain that h$r1 is a register.

doyougnu · 2023-03-02T15:27:40Z

blog/2023-02-28-js-backend-stacks.md

+function h$stackFrame_e() {
+  ...
+  h$r1 = somethingWHNF;
+  h$sp -= 3; // pop current frame
+  return h$stack[h$sp]; // return to next frame
+}


explain the example in a paragraph that immediately follows the example

doyougnu · 2023-03-02T15:31:22Z

blog/2023-02-28-js-backend-stacks.md

+
+Almost all stack frames have their size stored in the `size` property of the header. An exception is the `h$ap_gen` frame, which contains an arbitrary size function application. This frame type does not have a fixed size, and the size is stored in the payload of the frame itself. Frames `f` with the size stored in they payload of the frame have `f.size < 0`.
+
+## Exception Handling


this section was very good. The examples need to be explained more but other than that I thought it was very nice and clear. Good job!

hsyl20

Good writing Luite, thanks! I've left a few suggestions in addition to Jeff's ones.

hsyl20 · 2023-03-03T09:57:13Z

blog/2023-02-28-debugging-the-js-backend.md

+
+Logging main loop calls generates a lot of output, even more so than tracing specific RTS features, so it's probably necessary to redirect and/or truncate the output of `h$log` here.
+
+It's often useful to make the `haveToYield` condition deterministic, by not taking wall clock time into account. This runs each thread until it blocks or finishes (`c === h$reschedule`). That makes runs reproducible, even if more than one Haskell thread is involved.


Perhaps show how to change `haveToYield` to achieve this? Is there a CPP flag?
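
To make the request concrete, here is a toy sketch of the idea, not the RTS code. Every name except `haveToYield` is hypothetical, including the `deterministic` switch; whether the RTS exposes such a switch or a CPP flag is exactly the open question above.

```javascript
// Toy model only: reschedule, makeHaveToYield, runThread and countdown are
// illustrative names. The deterministic switch is an assumption.
var reschedule = function reschedule() { return reschedule; };

function makeHaveToYield(deterministic, timeSliceMs) {
  var start = Date.now();
  return function haveToYield() {
    if (deterministic) return false;          // ignore wall clock time
    return Date.now() - start > timeSliceMs;  // usual time-based preemption
  };
}

// Run one thread: keep calling the returned function until the thread
// blocks or finishes (c === reschedule) or the yield check fires.
function runThread(c, haveToYield) {
  var steps = 0;
  while (c !== reschedule && !haveToYield()) {
    c = c();
    steps++;
  }
  return steps;
}

// A thread that finishes after n + 1 calls.
function countdown(n) {
  return function step() {
    return n === 0 ? reschedule : countdown(n - 1);
  };
}

// With the deterministic check, every run takes the same number of steps.
console.log(runThread(countdown(1000), makeHaveToYield(true, 0)));
```

With the time-based check, the number of steps before a yield depends on machine load, which is what makes multi-threaded runs hard to reproduce.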

hsyl20 · 2023-03-03T10:04:03Z

blog/2023-02-28-ghcjs-weak-references.md

+,   finalizer: null or heap object
+}
+```
+This way the `h$Weak` does not reference they key itself. It still knows when the key is unreachable, since the mark of the `h$StableName` of the key would not be updated anymore.


Suggested change

This way the `h$Weak` does not reference they key itself. It still knows when the key is unreachable, since the mark of the `h$StableName` of the key would not be updated anymore.

This way the `h$Weak` does not reference the key itself. It still knows when the key is unreachable, since the mark of the `h$StableName` of the key would not be updated anymore.

hsyl20 · 2023-03-03T10:25:34Z

blog/2023-02-28-ghcjs-weak-references.md

+}
+```
+
+Now we can replace a `number` mark by an `h$StableName` for the key, and then create the weak reference as follows:


You should provide a link to the documentation of `StableName` and `Weak`.

Also you should mention that:

- `StableName`s have the crucial property here of not keeping the objects they refer to alive.
- In the JS implementation, `makeStableName` is guaranteed to return the same `StableName` for a given heap object: heap objects have a link to their associated `StableName` (if any). In addition, during GC traversals, if a heap object is marked as reachable, its associated `StableName` (if any) is marked as reachable too.

Hence `StableName`s are a perfect proxy to know if a heap object is reachable without keeping the actual object alive. This is exactly what we need for the key of `Weak`.
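
A toy model of this argument, as a sketch only: `HeapObject`, `StableName`, `Weak`, `markReachable` and `keyIsAlive` below are illustrative stand-ins, not the RTS structures (`h$StableName`, `h$Weak`), and the mark scheme is simplified to a single counter.

```javascript
// Toy model only; none of these definitions are taken from the RTS.
var currentMark = 0;

function StableName() { this.mark = -1; }

function HeapObject(fields) {
  this.fields = fields || [];
  this.mark = -1;
  this.stableName = null;   // link to the associated StableName, if any
}

// Always returns the same StableName for a given heap object.
function makeStableName(obj) {
  if (obj.stableName === null) obj.stableName = new StableName();
  return obj.stableName;
}

// The weak reference holds the key's StableName, never the key itself.
function Weak(key, value) {
  this.keyName = makeStableName(key);
  this.value = value;
}

// Mark phase: when an object is reachable, its StableName is marked too.
function markReachable(roots) {
  currentMark++;
  var work = roots.slice();
  while (work.length > 0) {
    var o = work.pop();
    if (o.mark === currentMark) continue;
    o.mark = currentMark;
    if (o.stableName !== null) o.stableName.mark = currentMark;
    work.push.apply(work, o.fields);
  }
}

// The key is alive iff its StableName was marked in the latest traversal.
function keyIsAlive(weak) { return weak.keyName.mark === currentMark; }

// Usage: the key is reachable through root, then the link is dropped.
var key = new HeapObject();
var root = new HeapObject([key]);
var weak = new Weak(key, "some value");
markReachable([root]);
var aliveBefore = keyIsAlive(weak);
root.fields = [];
markReachable([root]);
var aliveAfter = keyIsAlive(weak);
console.log(aliveBefore, aliveAfter);
```

The point of the sketch is that `Weak` never stores `key`, so the weak reference alone cannot keep the key alive, yet a stale mark on the `StableName` still reveals that the key has become unreachable.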

hsyl20 · 2023-03-03T10:27:30Z

blog/2023-02-28-js-backend-stacks.md

+
+## Haskell Lightweight Stacks
+
+In the context of a program produced by the GHC JavaScript backend, two different types of stack exist: The JavaScript call stack and Haskell lightweigt thread stacks. This blog post deals with the latter.


Good point. I've certainly been guilty of using former/latter too.

hsyl20 · 2023-03-03T10:32:30Z

blog/2023-02-28-js-backend-stacks.md

+
+## Stack Frames
+
+Each stack frame starts with a header, which is a JavaScript function. The header is followed by zero or more slots of payload, which can be arbitrary JavaScript values.


Perhaps mention that in JavaScript, functions can have properties, and that we use this feature to indicate the number of stack slots for the frame payload.

Ah, I see that you mention this later. I would put this here, to first explain the structure of the frame and then how we use it.
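
A small sketch of the feature in question, with made-up frame names. Two details are assumptions of this sketch rather than statements about the RTS: that `size` counts the header slot, and that a variable-size frame keeps its size in the slot directly below the header.

```javascript
// Illustrative frame headers; exampleFrame_e and genFrame_e are made-up names.
function exampleFrame_e() { /* entry code of the frame */ }
exampleFrame_e.size = 3;   // assumed here: header plus 2 payload slots

// A frame whose size lives in its payload, signalled by a negative size.
function genFrame_e() { /* entry code of the frame */ }
genFrame_e.size = -1;

// Size of the frame whose header is at stack[sp].
// Assumption: variable-size frames store their size just below the header.
function frameSize(stack, sp) {
  var header = stack[sp];
  return header.size >= 0 ? header.size : stack[sp - 1];
}

// Usage: one fixed-size frame and one variable-size frame.
var fixedStack = ["payload 1", "payload 2", exampleFrame_e];
var genStack = ["arg 1", "arg 2", "arg 3", 5, genFrame_e];
console.log(frameSize(fixedStack, 2), frameSize(genStack, 4));
```

Storing the size on the function object means a stack walker needs nothing but the header itself to find the next frame.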

hsyl20 · 2023-03-03T10:59:21Z

blog/2023-02-28-js-backend-stacks.md

+
+Each stack frame starts with a header, which is a JavaScript function. The header is followed by zero or more slots of payload, which can be arbitrary JavaScript values.
+
+The header serves as the "return point": When some code is done reducing some value to weak-head normal form it returns this value to the next stack frame by storing it in `h$r1` (or more for large values or unboxed tuples), popping its own stack frame and calling the header of the next stack frame at `h$stack[h$sp]`


I would write something like this to introduce the topic first:

GHC's calling convention for functions generated from STG is to always perform tail-calls, where the tail-call target is a continuation.
In practice, values "returned" by a function are in fact passed as arguments to its continuation.
When the continuation isn't statically known, it is passed via the stack, similarly to C's calling convention where return addresses are passed on the C stack.

In detail, what happens in this case is:

- "Returned values" are stored into global variables corresponding to registers (`h$r1`...).
- The current function pops its own stack frame from the stack (if any?).
- Remember that the header of a stack frame is directly a JavaScript function: the entry code of the stack frame. The current function should call this function to call the continuation.
- BUT remember that tail-calls aren't supported by JavaScript, hence what happens is that the current function returns the continuation to the scheduler instead of calling it directly, avoiding ever-growing call stacks. The scheduler then calls it (this method of implementing tail-calls is called "trampolining").

Here is an annotated code example of this process: ...
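
As a starting point, a toy version of the sequence described in this comment. It is a sketch, not RTS code: `stack`, `sp` and `r1` only mirror `h$stack`, `h$sp` and `h$r1`, and `done`, `addFrame_e`, `produce40` and `run` are invented for the example.

```javascript
// Toy trampoline; all names are illustrative.
var stack = [], sp = -1, r1 = null;
var done = function done() { return done; };   // sentinel: nothing left to run

// Continuation frame: adds its payload slot to the "returned" value in r1.
function addFrame_e() {
  r1 = r1 + stack[sp - 1];   // payload sits below the header
  sp -= 2;                   // pop own frame (header + 1 payload slot)
  return stack[sp];          // hand the next continuation to the trampoline
}

// A function that "returns" 40: store the result in the register, then
// return the continuation on top of the stack instead of calling it.
function produce40() {
  r1 = 40;
  return stack[sp];
}

// The trampoline (scheduler loop): it calls whatever was returned, so the
// JavaScript call stack never grows beyond one frame above this loop.
function run(c) {
  while (c !== done) c = c();
}

// Set up: the bottom frame is the sentinel, above it an addFrame with payload 2.
stack[++sp] = done;
stack[++sp] = 2;
stack[++sp] = addFrame_e;
run(produce40);
console.log(r1);
```

Each function returns the next function to run instead of calling it, which is why a JavaScript stack trace taken inside any of them shows only the function itself and `run`.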

hsyl20 · 2023-03-03T11:04:35Z

blog/2023-02-28-js-backend-stacks.md

+
+## Conclusion
+
+We have that stacks in the JavaScript backend are represented by JavaScript arrays. The contents on the stack consists of stack frames with a header and a payload. The header of each stack frame contains some metadata so that code for exception can traverse the stack and transfer control to an exception handler.


Suggested change

We have that stacks in the JavaScript backend are represented by JavaScript arrays. The contents on the stack consists of stack frames with a header and a payload. The header of each stack frame contains some metadata so that code for exception can traverse the stack and transfer control to an exception handler.

We have seen that stacks of Haskell lightweight threads are represented by JavaScript arrays with the JavaScript backend. The contents on the stack consists of stack frames with a header and a payload. The header of each stack frame contains some metadata so that code for exception can traverse the stack and transfer control to an exception handler.

add blog posts about js backend debugging, stacks and weak refs

925803a

hsyl20 requested review from hsyl20, doyougnu and JoshMeredith February 28, 2023 13:29

doyougnu suggested changes Mar 2, 2023

hsyl20 requested changes Mar 3, 2023
