Reduce instruction dispatch by tail-call elimination (#95)
To enable tail-call optimization (TCO), the function emulate must be
converted into a self-recursive form. To do so, we add a member
tailcall to struct rv_insn_t, which indicates whether the basic block
has terminated, and use it to rewrite emulate as a self-recursive
function. However, performance analysis showed that the emulator still
spent a significant amount of time calculating the jump address when
dispatching instructions. We therefore follow the wasm3 approach, which
gives every instruction its own emulation function, and add a member
impl to struct rv_insn_t so that the emulation function can be assigned
directly to the IR.
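
The idea can be illustrated with a minimal sketch. It is not the code of
this commit: the register file layout, the handlers do_addi and do_jump,
and the helper run_block are hypothetical, and it assumes tailcall is
true while more instructions of the same basic block follow. Note that C
does not guarantee tail-call elimination; the sketch relies on the
compiler's sibling-call optimization (typically enabled at -O2), which
clang can additionally enforce with __attribute__((musttail)).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily reduced stand-ins for the emulator's types. */
typedef struct riscv riscv_t;
typedef struct rv_insn rv_insn_t;

struct riscv {
    uint32_t X[32]; /* integer registers (x0 hard-wiring omitted) */
    uint32_t PC;
};

struct rv_insn {
    uint8_t rd, rs1;
    int32_t imm;
    bool tailcall; /* assumed: true while the basic block continues */
    bool (*impl)(riscv_t *rv, const rv_insn_t *ir); /* emulation function */
};

/* addi: does its work, then chains to the next instruction's handler. */
static bool do_addi(riscv_t *rv, const rv_insn_t *ir)
{
    rv->X[ir->rd] = rv->X[ir->rs1] + (uint32_t) ir->imm;
    rv->PC += 4;
    if (!ir->tailcall)
        return true;
    const rv_insn_t *next = ir + 1;
    return next->impl(rv, next); /* call in tail position */
}

/* PC-relative jump: always ends the basic block in this sketch. */
static bool do_jump(riscv_t *rv, const rv_insn_t *ir)
{
    rv->PC += (uint32_t) ir->imm;
    return true;
}

/* Entry point: one indirect call starts the whole chain. */
static bool run_block(riscv_t *rv, const rv_insn_t *first)
{
    assert(first && first->impl);
    return first->impl(rv, first);
}
```

Because each handler already knows its successor, no opcode lookup is
needed between instructions; the only dispatch cost is the indirect call
through impl.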
CoreMark results:
| Model        | Compiler | f2da162 | TCO     | Speedup |
|--------------+----------+---------+---------+---------|
| Core i7-8700 | clang-15 | 836.484 | 971.951 | +13.9%  |
|--------------+----------+---------+---------+---------|
| Core i7-8700 | gcc-12   | 888.342 | 963.336 | +7.8%   |
|--------------+----------+---------+---------+---------|
| eMAG 8180    | clang-15 | 286.000 | 335.396 | +20.5%  |
|--------------+----------+---------+---------+---------|
| eMAG 8180    | gcc-12   | 259.638 | 332.561 | +14.0%  |
Previously, when function "emulate" terminated, it returned to
function "block_emulate" because the previous calling sequence was
    rv_step ->
    block_emulate ->
    emulate ->
    block_emulate ->
    emulate ->
    ...
As a result, a new stack frame was created each time "emulate" was
invoked. In addition, "emulate" had to calculate the jump address for
every instruction by means of switch-case or computed goto. Now that
instruction emulation can be invoked directly, the calling sequence
becomes
    rv_step ->
    instruction emulation ->
    instruction emulation ->
    ...
Thanks to TCO, the instruction emulations can now share a single stack
frame. That is, every instruction in a basic block is emulated within
the same stack frame, which saves the overhead of creating a new frame
for each one.
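
For contrast, the earlier dispatch style can be sketched as below. This
is an illustration only, not the removed code: the opcode tags, insn_t,
cpu_t and emulate_switch are hypothetical names. It shows the jump
target being recomputed from the opcode on every iteration, which is the
cost that direct invocation avoids.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcode tags; a real decoder has many more. */
enum { OP_ADDI, OP_JUMP };

typedef struct {
    uint8_t opcode, rd, rs1;
    int32_t imm;
} insn_t;

typedef struct {
    uint32_t X[32]; /* integer registers (x0 hard-wiring omitted) */
    uint32_t PC;
} cpu_t;

/* Emulate up to n decoded instructions of one basic block.
 * Every iteration pays for a switch, i.e. a jump-table lookup. */
static void emulate_switch(cpu_t *cpu, const insn_t *ir, size_t n)
{
    assert(cpu && ir);
    for (size_t i = 0; i < n; i++) {
        switch (ir[i].opcode) {
        case OP_ADDI:
            cpu->X[ir[i].rd] = cpu->X[ir[i].rs1] + (uint32_t) ir[i].imm;
            cpu->PC += 4;
            break;
        case OP_JUMP:
            cpu->PC += (uint32_t) ir[i].imm;
            return; /* a jump terminates the basic block */
        }
    }
}
```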