add blog posts about js backend debugging, stacks and weak refs #37
Most of the changes are in the debugging post. Other than that just some minor comments on explaining examples and structure of the writing.
|
||
This blog post is an experience report that presents a couple of practical techniques for debugging various problems in the JavaScript code. | ||
|
||
## Tracing Operations |
In the introduction you state "... presents a couple of practical techniques ...", so perhaps each technique should be labelled as such in the section header:
## Tracing Operations | |
## Technique 1: Tracing Operations |
|
||
## Tracing Operations | ||
|
||
Various components of the RTS have tracing options enabled by preprocessor definitions. For example weak reference operations can be traced by compiling the `rts` package with the `-DGHCJS_TRACE_WEAK` cpp option. |
Can you add an example? Maybe show some of the CPP'd code where the debug option lives.
|
||
Various components of the RTS have tracing options enabled by preprocessor definitions. For example weak reference operations can be traced by compiling the `rts` package with the `-DGHCJS_TRACE_WEAK` cpp option. | ||
|
||
Currently, enabling the trace functionality requires rebuilding the `rts` package, while previously with GHCJS it was possible to enable the required tracing by just recompiling the final program. We will likely change this setup to include all tracing functionality in a debug RTS that is linked when using the `-debug` flag, with easily modifiable global settings to enable or disable specific tracing modules. |
This paragraph should be the last one in the section because it is no longer talking about the technique, rather it is talking about the usability of the technique and then concludes that making this technique more usable is on our roadmap.
So the flow of the technique sections should be:
- What the technique is
- What information it provides
- How to use the technique
- Then future plans or issues with what we currently ship, i.e., the part where we say "right now this is hard because you have to rebuild with blah blah, but in the future we'll expose a flag"
|
||
Currently, enabling the trace functionality requires rebuilding the `rts` package, while previously with GHCJS it was possible to enable the required tracing by just recompiling the final program. We will likely change this setup to include all tracing functionality in a debug RTS that is linked when using the `-debug` flag, with easily modifiable global settings to enable or disable specific tracing modules. | ||
|
||
All the tracing uses the `h$log` function which can be easily modified to redirect the output of the trace, for example tracing only to an array (which can be watched by the JavaScript debugger) and keeping only the last `n` entries. |
This is the key part. As a reader looking to debug my JS backend code, this is the part I'm most interested in. Thus you should add the examples that you allude to; that is, add an example that demonstrates "easily modified to redirect the output of the trace". This would add a lot of value for the audience.
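A minimal sketch of what such an example could look like, assuming `h$log` may simply be replaced by another function before tracing starts. `makeRingLogger`, `h$trace` and the limit of 1000 are hypothetical names and values chosen for illustration; they are not part of the RTS.

```javascript
// Build a logger that keeps only the last `limit` trace entries in memory.
// The `entries` array can be watched from the JavaScript debugger.
function makeRingLogger(limit) {
  const entries = [];
  function log(...args) {
    // Store a single string per call, like a line of console output.
    entries.push(args.map(String).join(" "));
    // Drop the oldest entry once the buffer is full.
    if (entries.length > limit) entries.shift();
  }
  return { log, entries };
}

// Hypothetical wiring: route all trace output into the ring buffer.
// (Assumes the tracing code calls whatever `h$log` currently refers to.)
const h$trace = makeRingLogger(1000);
let h$log = h$trace.log;

h$log("weak", "created", 42);
```

After the failure has been reproduced, `h$trace.entries` holds the most recent trace lines, oldest first.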
} | ||
``` | ||
|
||
The main loop keeps calling the function returned by the previous call, until the thread has to stop for some reason. This means that the JavaScript call stack isn't very useful for figuring out where something goes wrong in our code: it only contains function calls up to the main loop. If some `c` fails, we don't know much about what calls led up to the error condition! |
"This means that the JavaScript call stack isn't very useful for figuring out where something goes wrong in our code"
I would start with this because it is where the audience is, and it is something the audience will probably assume. So something like this:
- Unfortunately the call stack is not that useful...
- The reason is that the main loop of the RTS is ... <the example with all the c()...>
- explain the example: the main loop keeps calling until...
- tie the example back into (1): if some `c` fails, we don't know much about what calls led up to the error condition!
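To make the point concrete, here is a simplified model of such a main loop, with the calls recorded in an array. It is only a sketch: `runMainLoop` and the `step*` functions are hypothetical, and the real loop also handles scheduling and yielding.

```javascript
// Simplified trampoline: keep calling whatever the previous call returned.
// Every call is made directly from the loop, so the JavaScript call stack
// never shows more than runMainLoop -> current function.
function runMainLoop(start, trace) {
  let c = start;
  while (typeof c === "function") {
    trace.push(c.name); // record the call, since the call stack will not
    c = c();
  }
}

// Three "continuations" that hand control to each other by returning.
function stepA() { return stepB; }
function stepB() { return stepC; }
function stepC() { return null; } // null stops this toy loop

const calls = [];
runMainLoop(stepA, calls);
```

If `stepC` threw an exception, its stack trace would mention only `runMainLoop` and `stepC`; the recorded `calls` array is what reveals that `stepA` and `stepB` ran first.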
```javascript | ||
// suspending a thread t | ||
t.stack = h$stack; | ||
t.sp = h$sp; | ||
h$currentThread = null; |
``` | ||
Explain this example.
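One way to explain the snippet is to show it next to its inverse. The following toy model uses the global names from the snippet; `suspendCurrentThread` and `resumeThread` are hypothetical helpers, and the resume step is an assumption about how the saved fields are used, not code taken from the RTS.

```javascript
// Toy model of the globals used in the snippet above (not the real RTS).
let h$stack = null;         // stack array of the running thread
let h$sp = -1;              // index of the top stack slot
let h$currentThread = null; // thread object that owns h$stack / h$sp

// Suspending: save the live stack state into the thread object.
function suspendCurrentThread() {
  const t = h$currentThread;
  t.stack = h$stack;
  t.sp = h$sp;
  h$currentThread = null;
  return t;
}

// Assumed inverse: restore the saved state into the globals.
function resumeThread(t) {
  h$stack = t.stack;
  h$sp = t.sp;
  h$currentThread = t;
}
```

Keeping the active stack in globals means that only a suspended thread carries its own `stack` and `sp`; while a thread runs, the globals are the source of truth.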
|
||
## Stack Frames |
aha! Perhaps link or reference this earlier at the place I made a comment about stack frames
|
||
Each stack frame starts with a header, which is a JavaScript function. The header is followed by zero or more slots of payload, which can be arbitrary JavaScript values. | ||
|
||
The header serves as the "return point": When some code is done reducing some value to weak-head normal form it returns this value to the next stack frame by storing it in `h$r1` (or more for large values or unboxed tuples), popping its own stack frame and calling the header of the next stack frame at `h$stack[h$sp]` |
Explain that `h$r1` is a register.
I would write something like this to introduce the topic first:
GHC's calling convention for functions generated from STG is to always perform tail calls, where the tail-call target is a continuation.
In practice, values "returned" by a function are in fact passed as arguments to its continuation.
When the continuation isn't statically known, it is passed via the stack, similarly to C's calling convention, where return addresses are pushed onto the C stack.
In detail, what happens in this case is:
- "returned values" are stored in global variables corresponding to registers (`h$r1`, ...)
- the current function pops its own stack frame from the stack (if any?)
- remember that the header of a stack frame is directly a JavaScript function: the entry code of the stack frame. The current function should call this function to call the continuation.
- BUT remember that tail calls aren't supported by JavaScript, hence what happens is that the current function returns the continuation to the scheduler instead of calling it directly, avoiding ever-growing call stacks. The scheduler then calls it (this method of implementing tail calls is called "trampolining").
Here is an annotated code example of this process: ...
function h$stackFrame_e() { | ||
... | ||
h$r1 = somethingWHNF; | ||
h$sp -= 3; // pop current frame | ||
return h$stack[h$sp]; // return to next frame | ||
} |
explain the example in a paragraph that immediately follows the example
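As a starting point for that explanation, here is a self-contained toy version of the return sequence. The names `stack`, `sp`, `r1`, `pushFrame`, `addFrame` and `done` are hypothetical stand-ins for `h$stack`, `h$sp`, `h$r1` and real frames; the layout (payload below the header, header on top) is inferred from the `h$sp -= 3` and `h$stack[h$sp]` lines in the example.

```javascript
const stack = []; // stand-in for h$stack
let sp = -1;      // stand-in for h$sp: index of the topmost header
let r1 = null;    // stand-in for the h$r1 register

// Push a frame: payload slots first, header last, so the header is on top.
function pushFrame(header, ...payload) {
  for (const p of payload) stack[++sp] = p;
  stack[++sp] = header;
}

// Bottom frame: returning a non-function stops the toy trampoline.
function done() { return null; }
done.size = 1;

// A frame with two payload slots that adds them to the "returned" value.
function addFrame() {
  r1 = r1 + stack[sp - 1] + stack[sp - 2];
  sp -= addFrame.size; // pop current frame (header + 2 payload slots)
  return stack[sp];    // return the next frame's header to the main loop
}
addFrame.size = 3;

pushFrame(done);
pushFrame(addFrame, 10, 20);

r1 = 1;                 // some code "returns" 1 ...
let c = stack[sp];      // ... to the frame on top of the stack
while (typeof c === "function") c = c();
```

After the loop, `r1` holds the result computed by `addFrame` and `sp` points at the bottom frame again.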
|
||
Almost all stack frames have their size stored in the `size` property of the header. An exception is the `h$ap_gen` frame, which contains an arbitrary-size function application. This frame type does not have a fixed size, and the size is stored in the payload of the frame itself. Frames `f` with the size stored in the payload of the frame have `f.size < 0`. | ||
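The size rule is what makes it possible to walk a stack frame by frame. The sketch below illustrates this under one loudly stated assumption: for frames with `size < 0`, the total frame size is taken from the payload slot directly below the header. The post does not say which slot holds it or how it is encoded, so that detail, like all function names here, is hypothetical.

```javascript
// Size of the frame whose header is at stack[sp].
function frameSize(stack, sp) {
  const header = stack[sp];
  // ASSUMPTION: variable-size frames keep their total size in the slot
  // directly below the header. The real encoding may differ.
  return header.size < 0 ? stack[sp - 1] : header.size;
}

// Walk from the top of the stack to the bottom, collecting frame info.
function walkFrames(stack, sp) {
  const frames = [];
  while (sp >= 0) {
    const size = frameSize(stack, sp);
    frames.push({ name: stack[sp].name, size });
    sp -= size;
  }
  return frames;
}

// Hypothetical frame headers; functions can carry a `size` property.
function bottomFrame() {} bottomFrame.size = 1;
function fixedFrame() {}  fixedFrame.size = 3;
function genFrame() {}    genFrame.size = -1; // size lives in the payload

const demoStack = [
  bottomFrame,                 // 1 slot
  "a", "b", fixedFrame,        // 3 slots
  "x", "y", "z", 5, genFrame   // 5 slots, size stored below the header
];
const frames = walkFrames(demoStack, demoStack.length - 1);
```

A debugging helper along these lines can print the frame names of a suspended thread, which partly compensates for the uninformative JavaScript call stack.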
|
||
## Exception Handling |
this section was very good. The examples need to be explained more but other than that I thought it was very nice and clear. Good job!
Good writing Luite, thanks! I've left a few suggestions in addition to Jeff's ones.
|
||
Logging main loop calls generates a lot of output, even more so than tracing specific RTS features, so it's probably necessary to redirect and/or truncate the output of `h$log` here. | ||
|
||
It's often useful to make the `haveToYield` condition deterministic, by not taking wall clock time into account. This runs each thread until it blocks or finishes (`c === h$reschedule`). That makes runs reproducible, even if more than one Haskell thread is involved. |
Perhaps show how to change `haveToYield` to achieve this? Is there a CPP flag?
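Until the post shows the real change, the idea can be illustrated with a toy loop in which the yield condition is a parameter. Everything here (`runThread`, `timeSliceYield`, `neverYield`, `countdown`, and the `reschedule` sentinel standing in for `h$reschedule`) is hypothetical; whether the RTS exposes a flag for this is exactly the open question.

```javascript
// Sentinel standing in for h$reschedule: the thread blocked or finished.
function reschedule() {}

// Run one thread until it reschedules or the yield condition fires.
function runThread(start, haveToYield) {
  let c = start;
  let steps = 0;
  while (c !== reschedule && !haveToYield()) {
    c = c();
    steps++;
  }
  return { finished: c === reschedule, steps };
}

// Wall-clock policy: the number of steps per run depends on timing.
function timeSliceYield(ms) {
  const deadline = Date.now() + ms;
  return () => Date.now() > deadline;
}

// Deterministic policy: never yield on time, only when the thread stops.
const neverYield = () => false;

// A toy thread that takes n steps and then reschedules.
function countdown(n) {
  return n === 0 ? reschedule : () => countdown(n - 1);
}

const result = runThread(countdown(3), neverYield);
```

With `neverYield`, every run of the same program interleaves threads identically, which is what makes a failing trace reproducible.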
, finalizer: null or heap object | ||
} | ||
``` | ||
This way the `h$Weak` does not reference they key itself. It still knows when the key is unreachable, since the mark of the `h$StableName` of the key would not be updated anymore. |
This way the `h$Weak` does not reference they key itself. It still knows when the key is unreachable, since the mark of the `h$StableName` of the key would not be updated anymore. | |
This way the `h$Weak` does not reference the key itself. It still knows when the key is unreachable, since the mark of the `h$StableName` of the key would not be updated anymore. |
} | ||
``` | ||
|
||
Now we can replace a `number` mark by an `h$StableName` for the key, and then create the weak reference as follows: |
You should provide a link to the documentation of StableName and Weak.
Also you should mention that:
- StableNames have the crucial property here of not keeping the objects they refer to alive.
- in the JS implementation, `makeStableName` is guaranteed to return the same StableName for a given heap object: heap objects have a link to their associated StableName (if any). In addition, during GC traversals, if a heap object is marked as reachable, its associated StableName (if any) is marked as reachable too.
Hence StableNames are a perfect proxy to know if a heap object is reachable without keeping the actual object alive. This is exactly what we need for the key of Weak.
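The proxy idea can be shown with a small model of the marking pass. This is a sketch under assumptions: the classes `HeapObj`, `StableName` and `Weak`, their field names, and `markFrom` / `keyIsAlive` are all invented for illustration and do not mirror the actual `h$StableName` or `h$Weak` layout.

```javascript
// A stable name only records the last GC mark of its object.
class StableName {
  constructor() { this.m = 0; }
}

// A heap object may reference other objects and at most one stable name.
class HeapObj {
  constructor(name, refs = []) {
    this.name = name;
    this.refs = refs;
    this.stableName = null;
    this.m = 0; // last GC mark
  }
}

// Always returns the same StableName for a given object.
function makeStableName(obj) {
  if (obj.stableName === null) obj.stableName = new StableName();
  return obj.stableName;
}

// The weak reference stores the key's stable name, never the key itself.
class Weak {
  constructor(key, val) {
    this.keym = makeStableName(key);
    this.val = val;
    this.finalizer = null;
  }
}

// Mark everything reachable from the roots; a reachable object also
// marks its stable name, if it has one.
function markFrom(roots, currentMark) {
  const work = [...roots];
  while (work.length > 0) {
    const o = work.pop();
    if (o.m === currentMark) continue;
    o.m = currentMark;
    if (o.stableName !== null) o.stableName.m = currentMark;
    work.push(...o.refs);
  }
}

// The key is alive iff its stable name was updated in this GC pass.
function keyIsAlive(weak, currentMark) {
  return weak.keym.m === currentMark;
}
```

Once nothing reachable refers to the key, the next marking pass leaves the stable name's mark stale, and the weak reference can be finalized without ever having held the key.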
|
||
## Haskell Lightweight Stacks | ||
|
||
In the context of a program produced by the GHC JavaScript backend, two different types of stack exist: the JavaScript call stack and Haskell lightweight thread stacks. This blog post deals with the latter. |
Good point. I've certainly been guilty of using former/latter too.
|
||
## Stack Frames | ||
|
||
Each stack frame starts with a header, which is a JavaScript function. The header is followed by zero or more slots of payload, which can be arbitrary JavaScript values. |
Perhaps mention that in JavaScript functions can have properties and that we use this feature to indicate the number of stack slots for the frame payload.
Ah, I see that you mention this later. I would put this here, to first explain the structure of the frame and then how we use it.
|
||
## Conclusion | ||
|
||
We have that stacks in the JavaScript backend are represented by JavaScript arrays. The contents on the stack consists of stack frames with a header and a payload. The header of each stack frame contains some metadata so that code for exception can traverse the stack and transfer control to an exception handler. |
We have that stacks in the JavaScript backend are represented by JavaScript arrays. The contents on the stack consists of stack frames with a header and a payload. The header of each stack frame contains some metadata so that code for exception can traverse the stack and transfer control to an exception handler. | |
We have seen that stacks of Haskell lightweight threads are represented by JavaScript arrays with the JavaScript backend. The contents on the stack consists of stack frames with a header and a payload. The header of each stack frame contains some metadata so that code for exception can traverse the stack and transfer control to an exception handler. |
Getting this off my plate: Blog posts about js backend debugging, stacks and weak references. Reviews/comments welcome.
(there's one more about compacting/sinking, but I still need to write a conclusion and some example of that one)