Commit c9a0ba8

Implement interrupting wasm code, reimplement stack overflow (#1490)
* Implement interrupting wasm code, reimplement stack overflow

This commit is a relatively large change for wasmtime with two main goals:

* Primarily this enables interrupting executing wasm code with a trap, preventing infinite loops in wasm code. Note that resumption of the wasm code is not a goal of this commit.

* Additionally this commit reimplements how we handle stack overflow to ensure that host functions always have a reasonable amount of stack to run on. This fixes an issue where we might longjmp out of a host function, skipping destructors.

Lots of various odds and ends end up falling out in this commit once the two goals above were implemented. The strategy for implementing this was also lifted from Spidermonkey and existing functionality inside of Cranelift. I've tried to write up thorough documentation of how this all works in `crates/environ/src/cranelift.rs` where gnarly-ish bits are.

A brief summary of how this works is that each function and each loop header now checks to see if they're interrupted. Interrupts and the stack overflow check are actually folded into one now, where function headers check to see if they've run out of stack and the sentinel value used to indicate an interrupt, checked in loop headers, tricks functions into thinking they're out of stack. An interrupt is basically just writing a value to a location which is read by JIT code.

When interrupts are delivered and what triggers them has been left up to embedders of the `wasmtime` crate. The `wasmtime::Store` type has a method to acquire an `InterruptHandle`, where `InterruptHandle` is a `Send` and `Sync` type which can travel to other threads (or perhaps even a signal handler) to get notified from. It's intended that this provides a good degree of flexibility when interrupting wasm code.

Note though that this does have a large caveat where interrupts don't work when you're interrupting host code, so if you've got a host import blocking for a long time an interrupt won't actually be received until the wasm starts running again.

Some fallout included from this change is:

* Unix signal handlers are no longer registered with `SA_ONSTACK`. Instead they run on the native stack the thread was already using. This is possible since stack overflow isn't handled by hitting the guard page, but rather it's explicitly checked for in wasm now. Native stack overflow will continue to abort the process as usual.
* Unix sigaltstack management is now no longer necessary since we don't use it any more.
* Windows no longer has any need to reset guard pages since we no longer try to recover from faults on guard pages.
* On all targets probestack intrinsics are disabled since we use a different mechanism for catching stack overflow.
* The C API has been updated with interrupts handles. An example has also been added which shows off how to interrupt a module.

Closes #139
Closes #860
Closes #900

* Update comment about magical interrupt value
* Store stack limit as a global value, not a closure
* Run rustfmt
* Handle review comments
* Add a comment about SA_ONSTACK
* Use `usize` for type of `INTERRUPTED`
* Parse human-readable durations
* Bring back sigaltstack handling

  Allows libstd to print out stack overflow on failure still.

* Add parsing and emission of stack limit-via-preamble
* Fix new example for new apis
* Fix host segfault test in release mode
* Fix new doc example
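The commit message describes an interrupt as "writing a value to a location which is read by JIT code". A minimal sketch of that idea in plain Rust follows. It is a toy model and not wasmtime's implementation: `ToyInterruptHandle`, `run_guest_loop`, `demo_interrupt`, and the exact value of `INTERRUPTED` are all assumptions made for illustration, and an ordinary Rust loop stands in for compiled wasm.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Assumed sentinel: a "stack limit" so high that any check against it fails.
const INTERRUPTED: usize = usize::MAX - 32 * 1024;

/// Hypothetical stand-in for a handle that can travel to another thread.
/// It is `Send + Sync` because it only wraps an `Arc<AtomicUsize>`.
#[derive(Clone)]
struct ToyInterruptHandle {
    stack_limit: Arc<AtomicUsize>,
}

impl ToyInterruptHandle {
    /// Deliver the interrupt by overwriting the shared stack limit.
    fn interrupt(&self) {
        self.stack_limit.store(INTERRUPTED, Ordering::SeqCst);
    }
}

/// Models an infinite guest loop whose header re-reads the stack limit.
/// Returns the number of iterations completed before the "trap".
fn run_guest_loop(stack_limit: &AtomicUsize) -> u64 {
    let mut iterations = 0u64;
    loop {
        // Loop-header check: the sentinel means "stop now".
        if stack_limit.load(Ordering::SeqCst) == INTERRUPTED {
            return iterations;
        }
        iterations += 1;
        std::hint::spin_loop();
    }
}

/// Spawns an interrupter thread and reports whether the loop was stopped.
fn demo_interrupt() -> bool {
    let stack_limit = Arc::new(AtomicUsize::new(0x1000));
    let handle = ToyInterruptHandle {
        stack_limit: Arc::clone(&stack_limit),
    };
    let interrupter = thread::spawn(move || {
        thread::sleep(Duration::from_millis(10));
        handle.interrupt();
    });
    let _iterations = run_guest_loop(&stack_limit);
    interrupter.join().unwrap();
    stack_limit.load(Ordering::SeqCst) == INTERRUPTED
}
```

Calling `demo_interrupt()` returns once the second thread has written the sentinel. The model also shows the caveat from the message: if the loop body blocked in host code instead of reaching the next header check, the store would not be observed until the loop resumed.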
1 parent 4a63a4d commit c9a0ba8

45 files changed: +1361 -143 lines changed

Cargo.lock

Lines changed: 1 addition & 0 deletions

Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -39,6 +39,7 @@ file-per-thread-logger = "0.1.1"
 wat = "1.0.10"
 libc = "0.2.60"
 rayon = "1.2.1"
+humantime = "1.3.0"

 [dev-dependencies]
 filecheck = "0.5.0"

cranelift/codegen/src/ir/entities.rs

Lines changed: 3 additions & 0 deletions
@@ -392,6 +392,8 @@ pub enum AnyEntity {
     Heap(Heap),
     /// A table.
     Table(Table),
+    /// A function's stack limit
+    StackLimit,
 }

 impl fmt::Display for AnyEntity {
@@ -409,6 +411,7 @@ impl fmt::Display for AnyEntity {
             Self::SigRef(r) => r.fmt(f),
             Self::Heap(r) => r.fmt(f),
             Self::Table(r) => r.fmt(f),
+            Self::StackLimit => write!(f, "stack_limit"),
         }
     }
 }

cranelift/codegen/src/ir/function.rs

Lines changed: 9 additions & 0 deletions
@@ -95,6 +95,13 @@ pub struct Function {
     ///
     /// This is used for some ABIs to generate unwind information.
     pub epilogues_start: Vec<Inst>,
+
+    /// An optional global value which represents an expression evaluating to
+    /// the stack limit for this function. This `GlobalValue` will be
+    /// interpreted in the prologue, if necessary, to insert a stack check to
+    /// ensure that a trap happens if the stack pointer goes below the
+    /// threshold specified here.
+    pub stack_limit: Option<ir::GlobalValue>,
 }

 impl Function {
@@ -119,6 +126,7 @@ impl Function {
             srclocs: SecondaryMap::new(),
             prologue_end: None,
             epilogues_start: Vec::new(),
+            stack_limit: None,
         }
     }

@@ -140,6 +148,7 @@ impl Function {
         self.srclocs.clear();
         self.prologue_end = None;
         self.epilogues_start.clear();
+        self.stack_limit = None;
     }

     /// Create a new empty, anonymous function with a Fast calling convention.

cranelift/codegen/src/isa/x86/abi.rs

Lines changed: 89 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -685,21 +685,32 @@ fn insert_common_prologue(
685685
fpr_slot: Option<&StackSlot>,
686686
isa: &dyn TargetIsa,
687687
) {
688-
if stack_size > 0 {
689-
// Check if there is a special stack limit parameter. If so insert stack check.
690-
if let Some(stack_limit_arg) = pos.func.special_param(ArgumentPurpose::StackLimit) {
691-
// Total stack size is the size of all stack area used by the function, including
692-
// pushed CSRs, frame pointer.
693-
// Also, the size of a return address, implicitly pushed by a x86 `call` instruction,
694-
// also should be accounted for.
695-
// If any FPR are present, count them as well as necessary alignment space.
696-
// TODO: Check if the function body actually contains a `call` instruction.
697-
let mut total_stack_size =
698-
(csrs.iter(GPR).len() + 1 + 1) as i64 * (isa.pointer_bytes() as isize) as i64;
699-
700-
total_stack_size += csrs.iter(FPR).len() as i64 * types::F64X2.bytes() as i64;
701-
702-
insert_stack_check(pos, total_stack_size, stack_limit_arg);
688+
// If this is a leaf function with zero stack, then there's no need to
689+
// insert a stack check since it can't overflow anything and
690+
// forward-progress is guarantee so long as loop are handled anyway.
691+
//
692+
// If this has a stack size it could stack overflow, or if it isn't a leaf
693+
// it could be part of a long call chain which we need to check anyway.
694+
//
695+
// First we look for the stack limit as a special argument to the function,
696+
// and failing that we see if a custom stack limit factory has been provided
697+
// which will be used to likely calculate the stack limit from the arguments
698+
// or perhaps constants.
699+
if stack_size > 0 || !pos.func.is_leaf() {
700+
let scratch = ir::ValueLoc::Reg(RU::rax as RegUnit);
701+
let stack_limit_arg = match pos.func.special_param(ArgumentPurpose::StackLimit) {
702+
Some(arg) => {
703+
let copy = pos.ins().copy(arg);
704+
pos.func.locations[copy] = scratch;
705+
Some(copy)
706+
}
707+
None => pos
708+
.func
709+
.stack_limit
710+
.map(|gv| interpret_gv(pos, gv, scratch)),
711+
};
712+
if let Some(stack_limit_arg) = stack_limit_arg {
713+
insert_stack_check(pos, stack_size, stack_limit_arg);
703714
}
704715
}
705716

@@ -811,16 +822,76 @@ fn insert_common_prologue(
811822
);
812823
}
813824

825+
/// Inserts code necessary to calculate `gv`.
826+
///
827+
/// Note that this is typically done with `ins().global_value(...)` but that
828+
/// requires legalization to run to encode it, and we're running super late
829+
/// here in the backend where legalization isn't possible. To get around this
830+
/// we manually interpret the `gv` specified and do register allocation for
831+
/// intermediate values.
832+
///
833+
/// This is an incomplete implementation of loading `GlobalValue` values to get
834+
/// compared to the stack pointer, but currently it serves enough functionality
835+
/// to get this implemented in `wasmtime` itself. This'll likely get expanded a
836+
/// bit over time!
837+
fn interpret_gv(pos: &mut EncCursor, gv: ir::GlobalValue, scratch: ir::ValueLoc) -> ir::Value {
838+
match pos.func.global_values[gv] {
839+
ir::GlobalValueData::VMContext => pos
840+
.func
841+
.special_param(ir::ArgumentPurpose::VMContext)
842+
.expect("no vmcontext parameter found"),
843+
ir::GlobalValueData::Load {
844+
base,
845+
offset,
846+
global_type,
847+
readonly: _,
848+
} => {
849+
let base = interpret_gv(pos, base, scratch);
850+
let ret = pos
851+
.ins()
852+
.load(global_type, ir::MemFlags::trusted(), base, offset);
853+
pos.func.locations[ret] = scratch;
854+
return ret;
855+
}
856+
ref other => panic!("global value for stack limit not supported: {}", other),
857+
}
858+
}
859+
814860
/// Insert a check that generates a trap if the stack pointer goes
815861
/// below a value in `stack_limit_arg`.
816862
fn insert_stack_check(pos: &mut EncCursor, stack_size: i64, stack_limit_arg: ir::Value) {
817863
use crate::ir::condcodes::IntCC;
818864

865+
// Our stack pointer, after subtracting `stack_size`, must not be below
866+
// `stack_limit_arg`. To do this we're going to add `stack_size` to
867+
// `stack_limit_arg` and see if the stack pointer is below that. The
868+
// `stack_size + stack_limit_arg` computation might overflow, however, due
869+
// to how stack limits may be loaded and set externally to trigger a trap.
870+
//
871+
// To handle this we'll need an extra comparison to see if the stack
872+
// pointer is already below `stack_limit_arg`. Most of the time this
873+
// isn't necessary though since the stack limit which triggers a trap is
874+
// likely a sentinel somewhere around `usize::max_value()`. In that case
875+
// only conditionally emit this pre-flight check. That way most functions
876+
// only have the one comparison, but are also guaranteed that if we add
877+
// `stack_size` to `stack_limit_arg` is won't overflow.
878+
//
879+
// This does mean that code generators which use this stack check
880+
// functionality need to ensure that values stored into the stack limit
881+
// will never overflow if this threshold is added.
882+
if stack_size >= 32 * 1024 {
883+
let cflags = pos.ins().ifcmp_sp(stack_limit_arg);
884+
pos.func.locations[cflags] = ir::ValueLoc::Reg(RU::rflags as RegUnit);
885+
pos.ins().trapif(
886+
IntCC::UnsignedGreaterThanOrEqual,
887+
cflags,
888+
ir::TrapCode::StackOverflow,
889+
);
890+
}
891+
819892
// Copy `stack_limit_arg` into a %rax and use it for calculating
820893
// a SP threshold.
821-
let stack_limit_copy = pos.ins().copy(stack_limit_arg);
822-
pos.func.locations[stack_limit_copy] = ir::ValueLoc::Reg(RU::rax as RegUnit);
823-
let sp_threshold = pos.ins().iadd_imm(stack_limit_copy, stack_size);
894+
let sp_threshold = pos.ins().iadd_imm(stack_limit_arg, stack_size);
824895
pos.func.locations[sp_threshold] = ir::ValueLoc::Reg(RU::rax as RegUnit);
825896

826897
// If the stack pointer currently reaches the SP threshold or below it then after opening
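The comment in `insert_stack_check` argues that `stack_limit_arg + stack_size` can wrap when the limit is a sentinel near `usize::max_value()`, which is why large frames get an extra pre-flight comparison. The arithmetic can be sketched in plain Rust as below. This is only a model of the comparisons, not the Cranelift code: `prologue_traps`, `main_check_only`, and the concrete `SENTINEL` value are assumptions chosen for illustration, and `u64` stands in for a 64-bit stack pointer.

```rust
/// Frames at least this large get the extra pre-flight comparison
/// (mirrors the `32 * 1024` constant in the diff).
const PREFLIGHT_THRESHOLD: u64 = 32 * 1024;

/// Assumed sentinel limit "somewhere around" the maximum value.
const SENTINEL: u64 = u64::MAX - 32 * 1024;

/// The single comparison every checked function performs:
/// trap if `stack_limit + stack_size >= sp` (unsigned, wrapping add).
fn main_check_only(sp: u64, stack_limit: u64, stack_size: u64) -> bool {
    let sp_threshold = stack_limit.wrapping_add(stack_size);
    sp_threshold >= sp
}

/// The full modeled prologue: pre-flight check for big frames, then the main check.
fn prologue_traps(sp: u64, stack_limit: u64, stack_size: u64) -> bool {
    // Pre-flight: is the stack pointer already at or below the limit?
    if stack_size >= PREFLIGHT_THRESHOLD && stack_limit >= sp {
        return true;
    }
    main_check_only(sp, stack_limit, stack_size)
}
```

With a small frame the sentinel cannot wrap, so the single comparison is enough. With a 40000-byte frame, as in the `%big_stack_limit` test, `SENTINEL + 40000` wraps around to a small number and the main comparison alone would miss the trap; the pre-flight comparison catches it.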

cranelift/codegen/src/write.rs

Lines changed: 5 additions & 0 deletions
@@ -107,6 +107,11 @@ pub trait FuncWriter {
             self.write_entity_definition(w, func, cref.into(), cval)?;
         }

+        if let Some(limit) = func.stack_limit {
+            any = true;
+            self.write_entity_definition(w, func, AnyEntity::StackLimit, &limit)?;
+        }
+
         Ok(any)
     }

cranelift/filetests/filetests/isa/x86/prologue-epilogue.clif

Lines changed: 59 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,7 @@
 test compile
 set opt_level=speed_and_size
 set is_pic
+set enable_probestack=false
 target x86_64 haswell

 ; An empty function.
@@ -244,7 +245,7 @@ block0(v0: i64):
 ; nextln:
 ; nextln: block0(v0: i64 [%rdi], v4: i64 [%rbp]):
 ; nextln: v1 = copy v0
-; nextln: v2 = iadd_imm v1, 16
+; nextln: v2 = iadd_imm v1, 176
 ; nextln: v3 = ifcmp_sp v2
 ; nextln: trapif uge v3, stk_ovf
 ; nextln: x86_push v4
@@ -254,3 +255,60 @@ block0(v0: i64):
 ; nextln: v5 = x86_pop.i64
 ; nextln: return v5
 ; nextln: }
+
+function %big_stack_limit(i64 stack_limit) {
+    ss0 = explicit_slot 40000
+block0(v0: i64):
+    return
+}
+
+; check: function %big_stack_limit(i64 stack_limit [%rdi], i64 fp [%rbp]) -> i64 fp [%rbp] fast {
+; nextln: ss0 = explicit_slot 40000, offset -40016
+; nextln: ss1 = incoming_arg 16, offset -16
+; nextln:
+; nextln: block0(v0: i64 [%rdi], v5: i64 [%rbp]):
+; nextln: v1 = copy v0
+; nextln: v2 = ifcmp_sp v1
+; nextln: trapif uge v2, stk_ovf
+; nextln: v3 = iadd_imm v1, 0x9c40
+; nextln: v4 = ifcmp_sp v3
+; nextln: trapif uge v4, stk_ovf
+; nextln: x86_push v5
+; nextln: copy_special %rsp -> %rbp
+; nextln: adjust_sp_down_imm 0x9c40
+; nextln: adjust_sp_up_imm 0x9c40
+; nextln: v6 = x86_pop.i64
+; nextln: return v6
+; nextln: }
+
+function %limit_preamble(i64 vmctx) {
+    gv0 = vmctx
+    gv1 = load.i64 notrap aligned gv0
+    gv2 = load.i64 notrap aligned gv1+4
+    stack_limit = gv2
+    ss0 = explicit_slot 20
+block0(v0: i64):
+    return
+}
+
+; check: function %limit_preamble(i64 vmctx [%rdi], i64 fp [%rbp]) -> i64 fp [%rbp] fast {
+; nextln: ss0 = explicit_slot 20, offset -36
+; nextln: ss1 = incoming_arg 16, offset -16
+; nextln: gv0 = vmctx
+; nextln: gv1 = load.i64 notrap aligned gv0
+; nextln: gv2 = load.i64 notrap aligned gv1+4
+; nextln: stack_limit = gv2
+; nextln:
+; nextln: block0(v0: i64 [%rdi], v5: i64 [%rbp]):
+; nextln: v1 = load.i64 notrap aligned v0
+; nextln: v2 = load.i64 notrap aligned v1+4
+; nextln: v3 = iadd_imm v2, 32
+; nextln: v4 = ifcmp_sp v3
+; nextln: trapif uge v4, stk_ovf
+; nextln: x86_push v5
+; nextln: copy_special %rsp -> %rbp
+; nextln: adjust_sp_down_imm 32
+; nextln: adjust_sp_up_imm 32
+; nextln: v6 = x86_pop.i64
+; nextln: return v6
+; nextln: }
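In the `%limit_preamble` test, the stack limit is reached through a chain of global values (`gv0 = vmctx`, `gv1 = load gv0`, `gv2 = load gv1+4`), and the expected prologue turns that chain into two loads. The recursion can be modeled in a few lines of Rust. This is a simplified sketch and not Cranelift's `interpret_gv`: `ToyGlobalValue`, `resolve`, the `HashMap` standing in for memory, and all addresses are hypothetical.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the two global value kinds the prologue supports.
enum ToyGlobalValue {
    /// The vmctx parameter itself.
    VmContext,
    /// A load from `base + offset`, where `base` indexes another global value.
    Load { base: usize, offset: u64 },
}

/// Resolve global value `gv` by walking the `Load` chain down to vmctx.
/// `memory` maps addresses to 64-bit words and replaces real loads.
fn resolve(
    gvs: &[ToyGlobalValue],
    gv: usize,
    vmctx: u64,
    memory: &HashMap<u64, u64>,
) -> u64 {
    match &gvs[gv] {
        ToyGlobalValue::VmContext => vmctx,
        ToyGlobalValue::Load { base, offset } => {
            // Resolve the base first, then perform the load at base + offset.
            let addr = resolve(gvs, *base, vmctx, memory) + offset;
            memory[&addr]
        }
    }
}

/// Builds the same gv0/gv1/gv2 chain as the `%limit_preamble` test.
fn limit_preamble_example() -> u64 {
    let gvs = [
        ToyGlobalValue::VmContext,                   // gv0 = vmctx
        ToyGlobalValue::Load { base: 0, offset: 0 }, // gv1 = load gv0
        ToyGlobalValue::Load { base: 1, offset: 4 }, // gv2 = load gv1+4
    ];
    let mut memory = HashMap::new();
    memory.insert(0x1000, 0x2000); // word stored at vmctx (made-up address)
    memory.insert(0x2004, 0xdead_0000); // word at gv1 + 4: the stack limit
    resolve(&gvs, 2, 0x1000, &memory)
}
```

Each `Load` corresponds to one `load.i64` in the expected output (`v1` and `v2`), and the resolved value is what the prologue then feeds into `iadd_imm` and `ifcmp_sp`.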

cranelift/reader/src/parser.rs

Lines changed: 46 additions & 0 deletions
@@ -358,6 +358,15 @@ impl<'a> Context<'a> {
         Ok(())
     }

+    // Configure the stack limit of the current function.
+    fn add_stack_limit(&mut self, limit: GlobalValue, loc: Location) -> ParseResult<()> {
+        if self.function.stack_limit.is_some() {
+            return err!(loc, "stack limit defined twice");
+        }
+        self.function.stack_limit = Some(limit);
+        Ok(())
+    }
+
     // Resolve a reference to a constant.
     fn check_constant(&self, c: Constant, loc: Location) -> ParseResult<()> {
         if !self.map.contains_constant(c) {
@@ -598,6 +607,15 @@ impl<'a> Parser<'a> {
         err!(self.loc, "expected constant number: const«n»")
     }

+    // Match and consume a stack limit token
+    fn match_stack_limit(&mut self) -> ParseResult<()> {
+        if let Some(Token::Identifier("stack_limit")) = self.token() {
+            self.consume();
+            return Ok(());
+        }
+        err!(self.loc, "expected identifier: stack_limit")
+    }
+
     // Match and consume a block reference.
     fn match_block(&mut self, err_msg: &str) -> ParseResult<Block> {
         if let Some(Token::Block(block)) = self.token() {
@@ -1455,6 +1473,7 @@ impl<'a> Parser<'a> {
     // * function-decl
     // * signature-decl
     // * jump-table-decl
+    // * stack-limit-decl
     //
     // The parsed decls are added to `ctx` rather than returned.
     fn parse_preamble(&mut self, ctx: &mut Context) -> ParseResult<()> {
@@ -1503,6 +1522,11 @@ impl<'a> Parser<'a> {
                     self.parse_constant_decl()
                         .and_then(|(c, v)| ctx.add_constant(c, v, self.loc))
                 }
+                Some(Token::Identifier("stack_limit")) => {
+                    self.start_gathering_comments();
+                    self.parse_stack_limit_decl()
+                        .and_then(|gv| ctx.add_stack_limit(gv, self.loc))
+                }
                 // More to come..
                 _ => return Ok(()),
             }?;
@@ -1907,6 +1931,28 @@ impl<'a> Parser<'a> {
         Ok((name, data))
     }

+    // Parse a stack limit decl
+    //
+    // stack-limit-decl ::= * StackLimit "=" GlobalValue(gv)
+    fn parse_stack_limit_decl(&mut self) -> ParseResult<GlobalValue> {
+        self.match_stack_limit()?;
+        self.match_token(Token::Equal, "expected '=' in stack limit decl")?;
+        let limit = match self.token() {
+            Some(Token::GlobalValue(base_num)) => match GlobalValue::with_number(base_num) {
+                Some(gv) => gv,
+                None => return err!(self.loc, "invalid global value number for stack limit"),
+            },
+            _ => return err!(self.loc, "expected global value"),
+        };
+        self.consume();
+
+        // Collect any trailing comments.
+        self.token();
+        self.claim_gathered_comments(AnyEntity::StackLimit);
+
+        Ok(limit)
+    }
+
     // Parse a function body, add contents to `ctx`.
     //
     // function-body ::= * { extended-basic-block }
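The parser changes above add one preamble production, `stack_limit = gvN`, and reject a second declaration in the same function. A standalone sketch of that grammar rule, written against plain strings instead of the reader's token stream, might look like this. Both functions are hypothetical simplifications (they reuse the names from the diff only for readability), the returned `u32` stands in for a `GlobalValue`, and the sketch assumes whitespace around `=`.

```rust
/// Parse `stack_limit = gvN` and return `N`.
/// Simplification: tokens must be separated by whitespace.
fn parse_stack_limit_decl(line: &str) -> Result<u32, String> {
    let mut parts = line.split_whitespace();
    // Match and consume the `stack_limit` identifier.
    match parts.next() {
        Some("stack_limit") => {}
        _ => return Err("expected identifier: stack_limit".to_string()),
    }
    // Match and consume `=`.
    match parts.next() {
        Some("=") => {}
        _ => return Err("expected '=' in stack limit decl".to_string()),
    }
    // Match a global value reference of the form `gvN`.
    parts
        .next()
        .and_then(|tok| tok.strip_prefix("gv"))
        .and_then(|num| num.parse::<u32>().ok())
        .ok_or_else(|| "expected global value".to_string())
}

/// Record the limit for the current function; a second declaration is an error.
fn add_stack_limit(current: &mut Option<u32>, gv: u32) -> Result<(), String> {
    if current.is_some() {
        return Err("stack limit defined twice".to_string());
    }
    *current = Some(gv);
    Ok(())
}
```

Feeding it the line from the `%limit_preamble` test, `stack_limit = gv2`, yields `Ok(2)`, and calling `add_stack_limit` twice on the same slot reproduces the "defined twice" error path.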

cranelift/reader/src/run_command.rs

Lines changed: 4 additions & 1 deletion
@@ -12,7 +12,8 @@ use std::fmt::{Display, Formatter, Result};

 /// A run command appearing in a test file.
 ///
-/// For parsing, see [Parser::parse_run_command].
+/// For parsing, see
+/// [Parser::parse_run_command](crate::parser::Parser::parse_run_command).
 #[derive(PartialEq, Debug)]
 pub enum RunCommand {
     /// Invoke a function and print its result.
@@ -66,6 +67,8 @@ impl Display for Invocation {

 /// Represent a data value. Where [Value] is an SSA reference, [DataValue] is the type + value
 /// that would be referred to by a [Value].
+///
+/// [Value]: cranelift_codegen::ir::Value
 #[allow(missing_docs)]
 #[derive(Clone, Debug, PartialEq)]
 pub enum DataValue {
