Question on usage of libc functions in Wasm #15296

Closed
miladfarca opened this issue Oct 14, 2021 · 11 comments

@miladfarca
Contributor

miladfarca commented Oct 14, 2021

I have a question on using libc functions such as malloc on Big Endian platforms.

Wasm enforces little-endian (LE) byte order. On big-endian (BE) platforms, JS engines such as V8 are responsible for reversing the bytes of every load/store instruction at runtime so that their behaviour matches LE platforms like x64.
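
To illustrate the byte reversal described above, here is a minimal sketch in C of how i32 loads and stores can be given little-endian semantics on any host. The names `wasm_store_i32`, `wasm_load_i32`, and `MEM_SIZE` are hypothetical and are not taken from V8; a real engine would more likely emit a byte-swap instruction on BE hosts than assemble bytes one at a time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MEM_SIZE 64 /* hypothetical size of the toy linear memory */

/* Store a 32-bit value at addr, least significant byte first,
   regardless of the host's native byte order. */
static void wasm_store_i32(uint8_t *mem, size_t addr, uint32_t value) {
    assert(addr <= MEM_SIZE - 4);
    mem[addr + 0] = (uint8_t)(value & 0xFF);
    mem[addr + 1] = (uint8_t)((value >> 8) & 0xFF);
    mem[addr + 2] = (uint8_t)((value >> 16) & 0xFF);
    mem[addr + 3] = (uint8_t)((value >> 24) & 0xFF);
}

/* Load a 32-bit value from addr, treating the first byte in memory
   as the least significant one. */
static uint32_t wasm_load_i32(const uint8_t *mem, size_t addr) {
    assert(addr <= MEM_SIZE - 4);
    return (uint32_t)mem[addr + 0]
         | ((uint32_t)mem[addr + 1] << 8)
         | ((uint32_t)mem[addr + 2] << 16)
         | ((uint32_t)mem[addr + 3] << 24);
}
```

Because both helpers are defined in terms of byte positions rather than the host's native integer layout, a value stored with one and loaded with the other comes back unchanged on LE and BE machines alike.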

My question is: what happens when system libraries are linked with Wasm on BE machines? Take this snippet as an example:

#include <stdlib.h>

int main(){
  int *p = (int*) malloc(sizeof(int));
  // ....
  free(p);
  return 0;
}
  • libc malloc will store a pointer value in BE order in memory.
  • Wasm will load this value in reverse at runtime, as every load/store instruction is supposed to reverse its bytes.
  • Now we have a pointer to an undefined/garbage location.

Does emscripten use the system malloc implementation at runtime to do this, or does it have its own implementation bundled with the final wasm binary file?

@kripken

@sbc100
Collaborator

sbc100 commented Oct 14, 2021

The only libc that can be linked into an emscripten project is one that is already built for WebAssembly. In practice the one we use is a custom version of Musl libc which is included with emscripten itself. In other words, there is no way to link the system version of libc/malloc into your program.
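
One consequence of this is that the allocator's own memory accesses follow the same byte-order convention as the application's. The sketch below models that in plain C under stated assumptions: `heap`, `store_i32`, `load_i32`, `toy_malloc`, `lib_save_pointer`, and `app_read_pointer` are hypothetical names, and the bump allocator is only a stand-in, not emscripten's actual allocator.

```c
#include <assert.h>
#include <stdint.h>

#define HEAP_SIZE 256 /* hypothetical toy linear memory size */

static uint8_t heap[HEAP_SIZE];
static uint32_t heap_top = 8; /* offsets 0..7 reserved; offset 0 acts as NULL */

/* i32 store with little-endian semantics, independent of the host. */
static void store_i32(uint32_t addr, uint32_t v) {
    assert(addr <= HEAP_SIZE - 4);
    for (int i = 0; i < 4; i++) heap[addr + i] = (uint8_t)(v >> (8 * i));
}

/* i32 load with the same little-endian semantics. */
static uint32_t load_i32(uint32_t addr) {
    assert(addr <= HEAP_SIZE - 4);
    uint32_t v = 0;
    for (int i = 0; i < 4; i++) v |= (uint32_t)heap[addr + i] << (8 * i);
    return v;
}

/* Toy bump allocator standing in for a malloc compiled to Wasm:
   returns an offset into linear memory, or 0 when out of space. */
static uint32_t toy_malloc(uint32_t size) {
    if (size > HEAP_SIZE) return 0;
    uint32_t aligned = (size + 7u) & ~7u; /* round up to 8 bytes */
    if (aligned > HEAP_SIZE - heap_top) return 0;
    uint32_t p = heap_top;
    heap_top += aligned;
    return p;
}

/* "Library" code writes a pointer into linear memory ... */
static void lib_save_pointer(uint32_t slot, uint32_t ptr) { store_i32(slot, ptr); }

/* ... and "application" code reads it back through the same convention. */
static uint32_t app_read_pointer(uint32_t slot) { return load_i32(slot); }
```

Since the library side and the application side both go through `store_i32` and `load_i32`, a pointer written by one is read back intact by the other, whatever the host byte order is.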

@miladfarca
Contributor Author

Thank you. Would you be able to point me to its implementation? Is this the right place to find it?
https://github.com/emscripten-core/emscripten/tree/main/system/lib/libc/musl/src/stdlib

@sbc100
Collaborator

sbc100 commented Oct 14, 2021

Yup, technically it's one level up from that, but you've got the right idea.

There is also a README that explains where it comes from: https://github.com/emscripten-core/emscripten/blob/main/system/lib/libc/README.md

I'm working on an update to the latest version of musl right now: #13006. As part of that I'm using an external repo to track our local changes: https://github.com/emscripten-core/musl

@miladfarca
Contributor Author

miladfarca commented Oct 14, 2021

Thank you, so I guess the malloc implementation is under emscripten-core/musl and not main/system/lib/libc.

Either way, I think the problem remains. We have attempted an initial fix for BE platforms in this PR: #13413
It fixes the usage of TypedArrays.

My question now is, do we also need to fix the libc implementation in this repository to load/store values in reverse on BE platforms?

@sbc100
Collaborator

sbc100 commented Oct 14, 2021

The default allocator we use is dlmalloc: https://github.com/emscripten-core/emscripten/blob/main/system/lib/dlmalloc.c.

I don't see how malloc is any different to any other native code. How is it different to any other native function that returns a pointer?

@miladfarca
Contributor Author

miladfarca commented Oct 14, 2021

Not different at all. I just wanted to use malloc as an example to learn what happens behind the scenes.

So the wasm runtime (V8) makes a call to a libc function which is also provided by emscripten. The libc implementation, however, is not LE enforced. I assume it passes its output back to V8 in BE order, which then gets reversed by V8. In other words, there is currently no bridge between libc and V8 within emscripten to reverse the values passed between them, correct?

@sbc100
Collaborator

sbc100 commented Oct 14, 2021

I don't know how V8 runs wasm on BE architectures... does this work at all?

I don't think you need to consider malloc or libc as special, since they are not; they are just ordinary functions. You can consider a much simpler wasm function:

int foo() { return 42; }

Just like malloc, this function returns an i32.
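
One way to see why such a return value needs no special handling: an i32 result is passed as a value, and byte order only shows up once that value is written to memory. A minimal sketch in C, where `host_is_big_endian`, `first_byte_in_memory`, and `check_return_value` are hypothetical helper names introduced purely for illustration:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

int foo(void) { return 42; }

/* Returns 1 on a big-endian host, 0 on a little-endian one. */
static int host_is_big_endian(void) {
    const uint32_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 0;
}

/* First byte of the native in-memory representation of v.
   This is the only place where host byte order becomes visible. */
static uint8_t first_byte_in_memory(uint32_t v) {
    uint8_t b;
    memcpy(&b, &v, 1);
    return b;
}

/* The returned value is 42 on every host; only its native memory
   layout differs between LE and BE machines. */
static void check_return_value(void) {
    assert(foo() == 42);
    assert(first_byte_in_memory((uint32_t)foo()) ==
           (host_is_big_endian() ? 0 : 42));
}
```

Calling `check_return_value()` passes on both kinds of host, which matches the point above: the caller receives the number 42, not a sequence of bytes.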

@sbc100
Collaborator

sbc100 commented Oct 14, 2021

Perhaps @kripken would better understand the intent behind this question?

@miladfarca
Contributor Author

miladfarca commented Oct 14, 2021

Thanks for clarifying. So essentially emscripten compiles every library it needs to wasm at compile time? That is, it doesn't link against a natively compiled libc function and jump to it at runtime?

@sbc100
Collaborator

sbc100 commented Oct 14, 2021

Yes, there is no way you can link any host libraries into an emscripten project. It's simply not possible.

@miladfarca
Contributor Author

Thanks again for clarifying. I will now close this issue.
