WASM module base address is always 0, prevents multiple modules with single imported memory #46645

Closed
Arnavion opened this issue Dec 10, 2017 · 7 comments
Labels
C-feature-request Category: A feature request, i.e: not implemented / a PR. O-wasm Target: WASM (WebAssembly), http://webassembly.org/

Comments

@Arnavion

Arnavion commented Dec 10, 2017

Every wasm module's base address currently defaults to 0 with no way for the crate to override it. This means multiple modules compiled with #![wasm_import_memory] can't use a single imported memory without clobbering each other's data.

Does it make sense to add a crate attribute like #![wasm_base_address(100)] to set that module's base address in binaryen's options to 100? This does mean that the user will have to manually update every crate's base address according to the base addresses of every other crate, so this doesn't work for crates they don't own [1].

The proper solution would be for the compiler to emit relocation information so that a post-processor like wasm-gc or a loader in the host environment can relocate the modules. As of now that isn't possible, since pointers to static data are indistinguishable from integers.

[1]: That is, it doesn't work for top-level crates (those that compile into WASM modules) that they don't own. rlib dependency crates are fine.
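
To make the manual bookkeeping concrete, here is a minimal sketch of how non-overlapping base addresses could be picked for several modules sharing one imported memory. The function names, the alignment parameter, and the idea of knowing each module's static-data size up front are assumptions made for illustration; nothing here is part of rustc, binaryen, or LLD.

```rust
/// Round `addr` up to the next multiple of `align` (`align` must be non-zero).
/// Overflow is not handled; this is only a sketch.
fn align_up(addr: u32, align: u32) -> u32 {
    (addr + align - 1) / align * align
}

/// Hypothetical helper: given the static-data size (in bytes) of each module,
/// return one base address per module so that no two data regions overlap
/// inside the shared linear memory.
fn assign_bases(data_sizes: &[u32], start: u32, align: u32) -> Vec<u32> {
    let mut next = align_up(start, align);
    let mut bases = Vec::with_capacity(data_sizes.len());
    for &size in data_sizes {
        // This module's data occupies [next, next + size).
        bases.push(next);
        // The following module starts at the next aligned address.
        next = align_up(next + size, align);
    }
    bases
}
```

Each resulting value is what would have to be written into the corresponding crate by hand, which is why the scheme breaks down as soon as one of the top-level crates cannot be edited.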

@alexcrichton alexcrichton added the O-wasm Target: WASM (WebAssembly), http://webassembly.org/ label Dec 11, 2017
@kennytm kennytm added the C-feature-request Category: A feature request, i.e: not implemented / a PR. label Dec 11, 2017
@alexcrichton
Member

I believe this is now configurable with LLD as the default linker, so closing

@Arnavion
Author

Arnavion commented Mar 6, 2018

Adding a pointer for future readers: #48125 (comment)

OVERVIEW: LLVM Linker

USAGE: ./obj/build/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/bin/lld [options] <inputs>

OPTIONS:
  --emit-relocs          Generate relocations in output
  --entry <entry>        Name of entry point symbol
  --global-base=<value>  Where to start to place global data
  --import-memory        Import memory from the environment
  --initial-memory=<value>
                         Initial size of the linear memory
  --max-memory=<value>   Maximum size of the linear memory
  --no-entry             Do not output any entry point
  --relocatable          Create relocatable object file

@Arnavion
Author

Arnavion commented Mar 7, 2018

  • --import-memory works perfectly.

  • --relocatable does have the effect that globals are defined for each global, but the codegen still uses the global offsets directly. This might be wasm2wat lying to me; I have to double-check the actual structures myself to confirm.

  • --emit-relocs seems to have no effect when used on top of --relocatable. It might already be implied by it.

  • --global-base seems to have no effect. Globals start at 0 regardless of the value given here.

So this doesn't seem to be fixed, but is probably better served by llvm/lld's issue tracker anyway.

@alexcrichton
Member

@Arnavion oh, I know for a fact that --relocatable isn't what you want: I think it's the same as ld -r, which means "given a bunch of object files, make a new object file", whereas here you'd want a final output. I think --emit-relocs is similar in that it probably doesn't have much to do with this.

For me though --global-base seems to be working?

$ cat foo.rs
#![crate_type = "cdylib"]

#[no_mangle]
pub extern fn foo() -> *const u8 {
    "foo".as_ptr()
}
$ rustc +nightly foo.rs -C link-args=--global-base=128 -O -C lto --target wasm32-unknown-unknown && wasm-gc foo.wasm && wasm2wat foo.wasm
(module
  (type (;0;) (func (result i32)))
  (func $foo (type 0) (result i32)
    i32.const 128)
  (table (;0;) 1 1 anyfunc)
  (memory (;0;) 2)
  (export "memory" (memory 0))
  (export "foo" (func $foo))
  (data (i32.const 128) "foo"))
$ rustc +nightly foo.rs -C link-args=--global-base=1000 -O -C lto --target wasm32-unknown-unknown && wasm-gc foo.wasm && wasm2wat foo.wasm
(module
  (type (;0;) (func (result i32)))
  (func $foo (type 0) (result i32)
    i32.const 1000)
  (table (;0;) 1 1 anyfunc)
  (memory (;0;) 2)
  (export "memory" (memory 0))
  (export "foo" (func $foo))
  (data (i32.const 1000) "foo"))

@Arnavion
Author

Arnavion commented Mar 7, 2018

Ah. It doesn't work when you combine it with --relocatable - the globals start from 0. When I take it out it does work, and as you said it's the wrong thing to use anyway. Everything's good, then. Thanks!

@Arnavion
Author

Arnavion commented Mar 7, 2018

I dumped the WASM structures and now I see that --relocatable does actually do what I thought it would. I was thrown off by the fact that functions were using global offsets directly and wasm2wat wasn't showing the custom reloc and linking sections. In fact the sections are there, and a linker / loader / post-processor would use the reloc sections to locate global references that need to be modified and the linking section to know what globals they correspond to. So what I wrote in the OP can be done.

Edit: And it makes sense why --global-base was being ignored when --relocatable was supplied, because global base address makes no sense for a relocatable module anyway.
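
For readers wondering what "use the reloc sections to locate global references that need to be modified" amounts to, here is a rough sketch of the patching step a loader or post-processor might perform. It is deliberately simplified: the `Reloc` struct and `relocate` function are hypothetical, and every address slot is assumed to be a plain 32-bit little-endian integer. The real reloc and linking sections use their own entry formats and encodings, which this sketch does not parse.

```rust
/// Hypothetical, simplified relocation entry: the byte offset of a 32-bit
/// little-endian address slot inside the module image.
struct Reloc {
    offset: usize,
}

/// Add `base` to every address slot listed in `relocs`.
/// Returns `None` as soon as a slot lies outside `image`; slots patched
/// before that point stay patched.
fn relocate(image: &mut [u8], relocs: &[Reloc], base: u32) -> Option<()> {
    for r in relocs {
        let end = r.offset.checked_add(4)?;
        let slot = image.get_mut(r.offset..end)?;
        let old = u32::from_le_bytes([slot[0], slot[1], slot[2], slot[3]]);
        // Rebase the stored address and write it back in place.
        slot.copy_from_slice(&old.wrapping_add(base).to_le_bytes());
    }
    Some(())
}
```

The point of the relocation list is that only the listed slots are touched, which resolves the problem from the OP that pointers to static data are otherwise indistinguishable from ordinary integers.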

@matt-williams

I was struggling to get --global-base working, even trying the example @alexcrichton included above (#46645 (comment)). Just a note in case anyone else hits the same problem and finds this issue...

It looks like, in the latest lld, the stack is placed first by default, and this pushes the globals to start at 1048576. (I think this might be due to the use of --stack-first, as added in #50543.)

$ cat foo.rs
#![crate_type = "cdylib"]

#[no_mangle]
pub extern fn foo() -> *const u8 {
    "foo".as_ptr()
}
$ rustc +nightly foo.rs -C link-args=--global-base=128 -O -C lto --target wasm32-unknown-unknown && wasm-gc foo.wasm && wasm2wat foo.wasm
(module
  (type (;0;) (func (result i32)))
  (func $foo (type 0) (result i32)
    i32.const 1048576)
  (table (;0;) 1 1 anyfunc)
  (memory (;0;) 17)
  (global (;0;) i32 (i32.const 1048579))
  (global (;1;) i32 (i32.const 1048579))
  (export "memory" (memory 0))
  (export "__indirect_function_table" (table 0))
  (export "__heap_base" (global 0))
  (export "__data_end" (global 1))
  (export "foo" (func $foo))
  (data (i32.const 1048576) "foo"))

My solution (as I needed to get my globals to be below 65536) was to change the size of the stack, using the -zstack-size linker argument.

$ rustc +nightly foo.rs -C link-args=-zstack-size=1024 -O -C lto --target wasm32-unknown-unknown && wasm-gc foo.wasm && wasm2wat foo.wasm
(module
  (type (;0;) (func (result i32)))
  (func $foo (type 0) (result i32)
    i32.const 1024)
  (table (;0;) 1 1 anyfunc)
  (memory (;0;) 1)
  (global (;0;) i32 (i32.const 1027))
  (global (;1;) i32 (i32.const 1027))
  (export "memory" (memory 0))
  (export "__indirect_function_table" (table 0))
  (export "__heap_base" (global 0))
  (export "__data_end" (global 1))
  (export "foo" (func $foo))
  (data (i32.const 1024) "foo"))
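
The numbers in the two outputs above are consistent with a simple model of the stack-first layout: static data starts right after the stack, and the memory is sized to the number of 64 KiB pages needed to cover the end of the data. The sketch below encodes that model. It is inferred only from these two outputs and is an assumption, not a description of what lld actually does (alignment and any other regions are ignored); the `Layout` type and function name are made up for illustration.

```rust
/// Size of one WebAssembly linear-memory page in bytes.
const PAGE_SIZE: u32 = 65536;

/// Hypothetical summary of where things end up in linear memory.
#[derive(Debug, PartialEq)]
struct Layout {
    data_start: u32,
    data_end: u32,
    pages: u32,
}

/// Assumed model: the stack occupies [0, stack_size), static data follows
/// immediately, and memory is rounded up to whole pages.
fn stack_first_layout(stack_size: u32, data_len: u32) -> Layout {
    let data_start = stack_size;
    let data_end = data_start + data_len;
    let pages = (data_end + PAGE_SIZE - 1) / PAGE_SIZE;
    Layout { data_start, data_end, pages }
}
```

With a stack size of 1048576 and the 3-byte "foo" string this gives a data start of 1048576, a data end of 1048579 and 17 pages; with a stack size of 1024 it gives 1024, 1027 and 1 page, matching the i32.const, __data_end and memory values shown in the outputs.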

4 participants