Commit f253d3b

Implement interrupting wasm code, reimplement stack overflow
This commit is a relatively large change for wasmtime with two main goals:

* Primarily this enables interrupting executing wasm code with a trap, preventing infinite loops in wasm code. Note that resumption of the wasm code is not a goal of this commit.

* Additionally this commit reimplements how we handle stack overflow to ensure that host functions always have a reasonable amount of stack to run on. This fixes an issue where we might longjmp out of a host function, skipping destructors.

Lots of various odds and ends end up falling out in this commit once the two goals above were implemented. The strategy for implementing this was also lifted from SpiderMonkey and existing functionality inside of Cranelift. I've tried to write up thorough documentation of how this all works in `crates/environ/src/cranelift.rs` where gnarly-ish bits are.

A brief summary of how this works is that each function and each loop header now checks to see if they're interrupted. Interrupts and the stack overflow check are actually folded into one now, where function headers check to see if they've run out of stack and the sentinel value used to indicate an interrupt, checked in loop headers, tricks functions into thinking they're out of stack. An interrupt is basically just writing a value to a location which is read by JIT code.

When interrupts are delivered and what triggers them has been left up to embedders of the `wasmtime` crate. The `wasmtime::Store` type has a method to acquire an `InterruptHandle`, where `InterruptHandle` is a `Send` and `Sync` type which can travel to other threads (or perhaps even a signal handler) to get notified from. It's intended that this provides a good degree of flexibility when interrupting wasm code. Note though that this does have a large caveat where interrupts don't work when you're interrupting host code, so if you've got a host import blocking for a long time an interrupt won't actually be received until the wasm starts running again.

Some fallout included from this change is:

* Unix signal handlers are no longer registered with `SA_ONSTACK`. Instead they run on the native stack the thread was already using. This is possible since stack overflow isn't handled by hitting the guard page, but rather it's explicitly checked for in wasm now. Native stack overflow will continue to abort the process as usual.

* Unix sigaltstack management is now no longer necessary since we don't use it any more.

* Windows no longer has any need to reset guard pages since we no longer try to recover from faults on guard pages.

* On all targets probestack intrinsics are disabled since we use a different mechanism for catching stack overflow.

* The C API has been updated with interrupt handles. An example has also been added which shows off how to interrupt a module.

Closes #139
Closes #860
Closes #900
1 parent a88e26c commit f253d3b
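
The commit message describes an interrupt as a write to a location that JIT code reads, with a sentinel value that makes the stack check fail. The following is a minimal sketch of that idea in plain Rust, not wasmtime's actual implementation. The names `Limits`, `INTERRUPTED`, `function_header_ok`, `loop_header_ok`, and `run_until_interrupted` are hypothetical, and the sentinel value is an assumption chosen for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical sentinel: a "stack limit" so high that every check fails.
/// The exact value used by wasmtime is not shown in this diff.
const INTERRUPTED: usize = usize::MAX - 32 * 1024;

/// Hypothetical stand-in for the memory location that JIT code reads.
struct Limits {
    stack_limit: AtomicUsize,
}

impl Limits {
    /// Function-header check: is there room for `frame` bytes below `sp`?
    /// `frame` is assumed small enough that the addition cannot overflow.
    fn function_header_ok(&self, sp: usize, frame: usize) -> bool {
        let limit = self.stack_limit.load(Ordering::SeqCst);
        sp > limit + frame
    }

    /// Loop-header check: only looks for the sentinel.
    fn loop_header_ok(&self) -> bool {
        self.stack_limit.load(Ordering::SeqCst) != INTERRUPTED
    }

    /// Delivering an interrupt is a single store of the sentinel.
    fn interrupt(&self) {
        self.stack_limit.store(INTERRUPTED, Ordering::SeqCst);
    }
}

/// Simulated wasm loop: spins until the loop-header check fails and
/// returns how many iterations ran before the "trap".
fn run_until_interrupted(limits: &Limits) -> u64 {
    let mut iterations = 0u64;
    while limits.loop_header_ok() {
        iterations += 1;
        std::hint::spin_loop();
    }
    iterations
}

fn main() {
    let limits = Arc::new(Limits {
        stack_limit: AtomicUsize::new(0x1000),
    });
    assert!(limits.function_header_ok(0x8000, 64));

    // Another thread plays the role of whoever holds the interrupt handle.
    let remote = Arc::clone(&limits);
    let interrupter = thread::spawn(move || remote.interrupt());

    let iterations = run_until_interrupted(&limits);
    interrupter.join().unwrap();

    // After the interrupt, function headers also believe the stack is exhausted.
    assert!(!limits.function_header_ok(0x8000, 64));
    println!("interrupted after {} iterations", iterations);
}
```

Because both checks read the same location, one store is enough to stop code whether it is about to call a function or is spinning in a loop. As the commit message notes, a host import that blocks never reaches either check, so the interrupt is only observed once wasm code runs again.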

33 files changed, +1142 -256 lines changed


cranelift/codegen/src/ir/function.rs

Lines changed: 15 additions & 0 deletions
@@ -4,6 +4,7 @@
 //! instructions.

 use crate::binemit::CodeOffset;
+use crate::cursor::EncCursor;
 use crate::entity::{PrimaryMap, SecondaryMap};
 use crate::ir;
 use crate::ir::{
@@ -96,6 +97,18 @@ pub struct Function {
     /// are saved in the frame. This information is created during the prologue and epilogue
     /// passes.
     pub frame_layout: Option<FrameLayout>,
+
+    /// An optional closure which calculates the stack limit for this function
+    /// from the arguments of the function.
+    ///
+    /// When configured a stack check will be emitted in the prologue of this
+    /// function, trapping if the stack check fails. This closure, if specified,
+    /// can be then used to infer the stack limit from provided arguments as an
+    /// alternative to using `ArgumentPurpose::StackLimit`.
+    ///
+    /// The first argument is the cursor we're encoding the prologue to, and
+    /// the second argument is a scratch register location if necessary.
+    pub stack_limit_from_arguments: Option<fn(&mut EncCursor, ir::ValueLoc) -> ir::Value>,
 }

 impl Function {
@@ -120,6 +133,7 @@ impl Function {
             srclocs: SecondaryMap::new(),
             prologue_end: None,
             frame_layout: None,
+            stack_limit_from_arguments: None,
         }
     }

@@ -141,6 +155,7 @@ impl Function {
         self.srclocs.clear();
         self.prologue_end = None;
         self.frame_layout = None;
+        self.stack_limit_from_arguments = None;
     }

     /// Create a new empty, anonymous function with a Fast calling convention.

cranelift/codegen/src/isa/x86/abi.rs

Lines changed: 55 additions & 14 deletions
@@ -641,17 +641,33 @@ fn insert_common_prologue(
     isa: &dyn TargetIsa,
 ) -> Option<CFAState> {
     let word_size = isa.pointer_bytes() as isize;
-    if stack_size > 0 {
-        // Check if there is a special stack limit parameter. If so insert stack check.
-        if let Some(stack_limit_arg) = pos.func.special_param(ArgumentPurpose::StackLimit) {
-            // Total stack size is the size of all stack area used by the function, including
-            // pushed CSRs, frame pointer.
-            // Also, the size of a return address, implicitly pushed by a x86 `call` instruction,
-            // also should be accounted for.
-            // TODO: Check if the function body actually contains a `call` instruction.
-            let total_stack_size = (csrs.iter(GPR).len() + 1 + 1) as i64 * word_size as i64;
-
-            insert_stack_check(pos, total_stack_size, stack_limit_arg);
+
+    // If this is a leaf function with zero stack, then there's no need to
+    // insert a stack check since it can't overflow anything and
+    // forward progress is guaranteed so long as loops are handled anyway.
+    //
+    // If this has a stack size it could stack overflow, or if it isn't a leaf
+    // it could be part of a long call chain which we need to check anyway.
+    //
+    // First we look for the stack limit as a special argument to the function,
+    // and failing that we see if a custom stack limit factory has been provided
+    // which will likely be used to calculate the stack limit from the arguments
+    // or perhaps constants.
+    if stack_size > 0 || !pos.func.is_leaf() {
+        let scratch = ir::ValueLoc::Reg(RU::rax as RegUnit);
+        let stack_limit_arg = match pos.func.special_param(ArgumentPurpose::StackLimit) {
+            Some(arg) => {
+                let copy = pos.ins().copy(arg);
+                pos.func.locations[copy] = scratch;
+                Some(copy)
+            }
+            None => pos
+                .func
+                .stack_limit_from_arguments
+                .map(|closure| closure(pos, scratch)),
+        };
+        if let Some(stack_limit_arg) = stack_limit_arg {
+            insert_stack_check(pos, stack_size, stack_limit_arg);
         }
     }
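
The hunk above decides whether a prologue needs a stack check and where the limit comes from. A simplified sketch of that decision order follows, using a hypothetical `FuncModel` type rather than Cranelift's real `Function` and `EncCursor`. In the real code the hook emits IR and returns an `ir::Value`; here it is assumed to return a plain number so the example can run on its own.

```rust
/// Hypothetical, simplified model of a function being compiled.
struct FuncModel {
    stack_size: i64,
    is_leaf: bool,
    /// Models a parameter marked `ArgumentPurpose::StackLimit`.
    stack_limit_param: Option<u64>,
    /// Models `stack_limit_from_arguments`: a fallback hook that derives
    /// the limit from the function's arguments.
    stack_limit_from_arguments: Option<fn(&[u64]) -> u64>,
}

/// Returns `Some(limit)` when a stack check should be emitted and a limit
/// is available, mirroring the order of lookups in the hunk above.
fn stack_limit_for_check(func: &FuncModel, args: &[u64]) -> Option<u64> {
    // A leaf function with no stack cannot overflow, so it is skipped.
    if func.stack_size == 0 && func.is_leaf {
        return None;
    }
    // Prefer the special parameter, then fall back to the hook.
    func.stack_limit_param
        .or_else(|| func.stack_limit_from_arguments.map(|hook| hook(args)))
}

/// Hypothetical hook: treat the first argument as the stack limit.
fn limit_from_first_arg(args: &[u64]) -> u64 {
    args[0]
}

fn main() {
    let hook = limit_from_first_arg as fn(&[u64]) -> u64;

    let leaf = FuncModel {
        stack_size: 0,
        is_leaf: true,
        stack_limit_param: Some(0x1000),
        stack_limit_from_arguments: None,
    };
    assert_eq!(stack_limit_for_check(&leaf, &[]), None);

    let with_param = FuncModel {
        stack_size: 64,
        is_leaf: true,
        stack_limit_param: Some(0x1000),
        stack_limit_from_arguments: Some(hook),
    };
    assert_eq!(stack_limit_for_check(&with_param, &[0x2000]), Some(0x1000));

    let with_hook = FuncModel {
        stack_size: 0,
        is_leaf: false,
        stack_limit_param: None,
        stack_limit_from_arguments: Some(hook),
    };
    assert_eq!(stack_limit_for_check(&with_hook, &[0x2000]), Some(0x2000));
    println!("all prologue decisions matched");
}
```

Note that a non-leaf function with zero stack still gets a check: it may sit in a deep call chain, and the check is also what lets an interrupt stop it.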

@@ -804,11 +820,36 @@ fn insert_common_prologue(
 fn insert_stack_check(pos: &mut EncCursor, stack_size: i64, stack_limit_arg: ir::Value) {
     use crate::ir::condcodes::IntCC;

+    // Our stack pointer, after subtracting `stack_size`, must not be below
+    // `stack_limit_arg`. To do this we're going to add `stack_size` to
+    // `stack_limit_arg` and see if the stack pointer is below that. The
+    // `stack_size + stack_limit_arg` computation might overflow, however, due
+    // to how stack limits may be loaded and set externally to trigger a trap.
+    //
+    // To handle this we'll need an extra comparison to see if the stack
+    // pointer is already below `stack_limit_arg`. Most of the time this
+    // isn't necessary though since the stack limit which triggers a trap is
+    // likely a sentinel somewhere around `usize::max_value()`. In that case
+    // only conditionally emit this pre-flight check. That way most functions
+    // only have the one comparison, but are also guaranteed that if we add
+    // `stack_size` to `stack_limit_arg` it won't overflow.
+    //
+    // This does mean that code generators which use this stack check
+    // functionality need to ensure that values stored into the stack limit
+    // will never overflow if this threshold is added.
+    if stack_size >= 32 * 1024 {
+        let cflags = pos.ins().ifcmp_sp(stack_limit_arg);
+        pos.func.locations[cflags] = ir::ValueLoc::Reg(RU::rflags as RegUnit);
+        pos.ins().trapif(
+            IntCC::UnsignedGreaterThanOrEqual,
+            cflags,
+            ir::TrapCode::StackOverflow,
+        );
+    }
+
     // Copy `stack_limit_arg` into a %rax and use it for calculating
     // a SP threshold.
-    let stack_limit_copy = pos.ins().copy(stack_limit_arg);
-    pos.func.locations[stack_limit_copy] = ir::ValueLoc::Reg(RU::rax as RegUnit);
-    let sp_threshold = pos.ins().iadd_imm(stack_limit_copy, stack_size);
+    let sp_threshold = pos.ins().iadd_imm(stack_limit_arg, stack_size);
     pos.func.locations[sp_threshold] = ir::ValueLoc::Reg(RU::rax as RegUnit);

     // If the stack pointer currently reaches the SP threshold or below it then after opening
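
The comment in `insert_stack_check` explains why frames of 32 KiB or more get a second comparison. The arithmetic can be modelled in ordinary Rust, as in the sketch below. This is a software model of the emitted checks, not Cranelift code: `stack_check`, `single_comparison`, `Check`, and the `SENTINEL` value are hypothetical names and values chosen for illustration, and `wrapping_add` stands in for a machine addition that wraps on overflow.

```rust
/// Frames at least this large get the extra pre-flight comparison
/// (the 32 KiB threshold comes from the diff above).
const PREFLIGHT_THRESHOLD: u64 = 32 * 1024;

/// Hypothetical interrupt sentinel near the top of the address space.
const SENTINEL: u64 = u64::MAX - 32 * 1024;

#[derive(Debug, PartialEq)]
enum Check {
    Ok,
    StackOverflow,
}

/// The single comparison: trap if `stack_limit + stack_size >= sp`.
fn single_comparison(sp: u64, stack_limit: u64, stack_size: u64) -> Check {
    let sp_threshold = stack_limit.wrapping_add(stack_size);
    if sp_threshold >= sp {
        Check::StackOverflow
    } else {
        Check::Ok
    }
}

/// The full check: large frames first compare `sp` against the raw limit,
/// so a wrapped addition can no longer hide an overflow or an interrupt.
fn stack_check(sp: u64, stack_limit: u64, stack_size: u64) -> Check {
    if stack_size >= PREFLIGHT_THRESHOLD && stack_limit >= sp {
        return Check::StackOverflow;
    }
    single_comparison(sp, stack_limit, stack_size)
}

fn main() {
    let sp = 0x10_0000;

    // Ordinary case: plenty of room for a 40000-byte frame.
    assert_eq!(stack_check(sp, 0x1000, 40_000), Check::Ok);

    // Small frame plus sentinel: the addition does not wrap, so one
    // comparison is enough to trap.
    assert_eq!(stack_check(sp, SENTINEL, 176), Check::StackOverflow);

    // Large frame plus sentinel: the addition wraps to a tiny value, so the
    // single comparison misses the trap while the pre-flight check catches it.
    assert_eq!(single_comparison(sp, SENTINEL, 40_000), Check::Ok);
    assert_eq!(stack_check(sp, SENTINEL, 40_000), Check::StackOverflow);

    println!("stack check model behaves as described");
}
```

This also shows the constraint stated at the end of the comment: whoever writes the limit must keep it far enough below the maximum that adding any frame size under the threshold cannot wrap.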

cranelift/filetests/filetests/isa/x86/prologue-epilogue.clif

Lines changed: 27 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,7 @@
11
test compile
22
set opt_level=speed_and_size
33
set is_pic
4+
set enable_probestack=false
45
target x86_64 haswell
56

67
; An empty function.
@@ -244,7 +245,7 @@ block0(v0: i64):
244245
; nextln:
245246
; nextln: block0(v0: i64 [%rdi], v4: i64 [%rbp]):
246247
; nextln: v1 = copy v0
247-
; nextln: v2 = iadd_imm v1, 16
248+
; nextln: v2 = iadd_imm v1, 176
248249
; nextln: v3 = ifcmp_sp v2
249250
; nextln: trapif uge v3, stk_ovf
250251
; nextln: x86_push v4
@@ -254,3 +255,28 @@ block0(v0: i64):
254255
; nextln: v5 = x86_pop.i64
255256
; nextln: return v5
256257
; nextln: }
258+
259+
function %big_stack_limit(i64 stack_limit) {
260+
ss0 = explicit_slot 40000
261+
block0(v0: i64):
262+
return
263+
}
264+
265+
; check: function %big_stack_limit(i64 stack_limit [%rdi], i64 fp [%rbp]) -> i64 fp [%rbp] fast {
266+
; nextln: ss0 = explicit_slot 40000, offset -40016
267+
; nextln: ss1 = incoming_arg 16, offset -16
268+
; nextln:
269+
; nextln: block0(v0: i64 [%rdi], v5: i64 [%rbp]):
270+
; nextln: v1 = copy v0
271+
; nextln: v2 = ifcmp_sp v1
272+
; nextln: trapif uge v2, stk_ovf
273+
; nextln: v3 = iadd_imm v1, 0x9c40
274+
; nextln: v4 = ifcmp_sp v3
275+
; nextln: trapif uge v4, stk_ovf
276+
; nextln: x86_push v5
277+
; nextln: copy_special %rsp -> %rbp
278+
; nextln: adjust_sp_down_imm 0x9c40
279+
; nextln: adjust_sp_up_imm 0x9c40
280+
; nextln: v6 = x86_pop.i64
281+
; nextln: return v6
282+
; nextln: }

cranelift/reader/src/run_command.rs

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -12,7 +12,8 @@ use std::fmt::{Display, Formatter, Result};
1212

1313
/// A run command appearing in a test file.
1414
///
15-
/// For parsing, see [Parser::parse_run_command].
15+
/// For parsing, see
16+
/// [Parser::parse_run_command](crate::parser::Parser::parse_run_command).
1617
#[derive(PartialEq, Debug)]
1718
pub enum RunCommand {
1819
/// Invoke a function and print its result.
@@ -66,6 +67,8 @@ impl Display for Invocation {
6667

6768
/// Represent a data value. Where [Value] is an SSA reference, [DataValue] is the type + value
6869
/// that would be referred to by a [Value].
70+
///
71+
/// [Value]: cranelift_codegen::ir::Value
6972
#[allow(missing_docs)]
7073
#[derive(Clone, Debug, PartialEq)]
7174
pub enum DataValue {

crates/api/src/func.rs

Lines changed: 14 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -175,6 +175,7 @@ macro_rules! getters {
175175
// ... and then once we've passed the typechecks we can hand out our
176176
// object since our `transmute` below should be safe!
177177
let f = self.wasmtime_function();
178+
let max_wasm_stack = self.store.engine().config().max_wasm_stack;
178179
Ok(move |$($args: $args),*| -> Result<R, Trap> {
179180
unsafe {
180181
let fnptr = mem::transmute::<
@@ -187,7 +188,7 @@ macro_rules! getters {
187188
>(f.address);
188189
let mut ret = None;
189190
$(let $args = $args.into_abi();)*
190-
wasmtime_runtime::catch_traps(f.vmctx, || {
191+
wasmtime_runtime::catch_traps(f.vmctx, max_wasm_stack, || {
191192
ret = Some(fnptr(f.vmctx, ptr::null_mut(), $($args,)*));
192193
}).map_err(Trap::from_jit)?;
193194
Ok(ret.unwrap())
@@ -525,14 +526,18 @@ impl Func {
525526

526527
// Call the trampoline.
527528
if let Err(error) = unsafe {
528-
wasmtime_runtime::catch_traps(self.export.vmctx, || {
529-
(self.trampoline)(
530-
self.export.vmctx,
531-
ptr::null_mut(),
532-
self.export.address,
533-
values_vec.as_mut_ptr(),
534-
)
535-
})
529+
wasmtime_runtime::catch_traps(
530+
self.export.vmctx,
531+
self.store.engine().config().max_wasm_stack,
532+
|| {
533+
(self.trampoline)(
534+
self.export.vmctx,
535+
ptr::null_mut(),
536+
self.export.address,
537+
values_vec.as_mut_ptr(),
538+
)
539+
},
540+
)
536541
} {
537542
return Err(Trap::from_jit(error).into());
538543
}
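
Both call sites now pass `max_wasm_stack` into `catch_traps`. The body of `catch_traps` is not part of this excerpt, so the following is only a sketch of one plausible way a byte budget could become a stack limit, assuming a stack that grows downward. The helpers `approximate_stack_pointer` and `wasm_stack_limit`, and the 1 MiB budget, are hypothetical.

```rust
/// Approximates the current stack pointer with the address of a local.
/// This is a rough stand-in; it is not how a runtime would read `sp`.
#[inline(never)]
fn approximate_stack_pointer() -> usize {
    let marker = 0u8;
    &marker as *const u8 as usize
}

/// Derives the lowest address wasm code may reach, given the stack pointer
/// at entry and a byte budget. Saturates at zero instead of wrapping.
fn wasm_stack_limit(sp_at_entry: usize, max_wasm_stack: usize) -> usize {
    sp_at_entry.saturating_sub(max_wasm_stack)
}

fn main() {
    // Hypothetical budget of 1 MiB for wasm frames.
    let max_wasm_stack = 1 << 20;

    let sp = approximate_stack_pointer();
    let limit = wasm_stack_limit(sp, max_wasm_stack);

    // The limit never lies above the entry stack pointer.
    assert!(limit <= sp);
    println!("sp ~ {:#x}, wasm stack limit = {:#x}", sp, limit);
}
```

Computing the limit relative to the stack pointer at entry, rather than from the bottom of the thread's stack, is what would leave host functions called from wasm with the remaining native stack, which matches the second goal stated in the commit message.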

crates/api/src/instance.rs

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -32,6 +32,7 @@ fn instantiate(
3232
&mut resolver,
3333
sig_registry,
3434
config.memory_creator.as_ref().map(|a| a as _),
35+
config.max_wasm_stack,
3536
)
3637
.map_err(|e| -> Error {
3738
match e {
