Guidance for wasm target vectorization? #2196

Open
jeremy-coleman opened this issue Oct 21, 2021 · 7 comments
Labels
enhancement New feature or request

Comments

@jeremy-coleman

Hi. I have a few questions and requests for guidance regarding the wasm target (for the web).

Does the TinyGo LLVM pass try to autovectorize Go code?

Does the TinyGo Clang pass on C imports autovectorize like Emscripten does? Generally, I'm unclear on the roles of Clang vs. Emscripten. I know emcc shims coreutils/SDL/etc. and does some code-size-related work. I also know Clang does the initial autovectorization, but is Clang targeting something like SSE4, which emcc then translates to wasm SIMD? Or can Clang autovectorize for a wasm target (and thus TinyGo too)?

Do the TinyGo LLVM IR and any C/Clang LLVM IR get merged together before codegen happens, or are they just linked together? I am a complete novice here. (Please don't spend much effort answering this; I have a feeling an answer could be nearly infinitely complex.)

Do the Clang/LLVM compiler args affect both the C and Go output, or are there individual configs for each? I have read the docs (several times), but I am still unclear on how it all fits together.

I guess the tl;dr is: if I want to write code that can be autovectorized into a wasm module, is TinyGo with either Go or C a good fit?

@aykevl
Member

aykevl commented Oct 21, 2021

TinyGo doesn't do any autovectorization. LLVM might do it, but only when the appropriate extensions are enabled.

> Generally, I'm unclear on the roles of Clang vs. Emscripten. I know emcc shims coreutils/SDL/etc. and does some code-size-related work.

Emscripten is a whole compiler toolchain that includes Clang, wasm-opt (to optimize the resulting wasm), and lots of shims to translate some common APIs like SDL to web equivalents. So Clang is just one component of it.

To be clear:

  • LLVM is a compiler toolkit that provides all kinds of things useful for compilers (here is where autovectorization happens)
  • Clang is a compiler (that uses LLVM): it converts C/C++ etc. into object code
  • Emscripten is a cross-compilation toolchain: it bundles Clang and a load of other things together to make compiling for the web easy
  • TinyGo also uses LLVM (and sometimes Clang for C code) and also ships with some libraries, but those are of course very different as we're talking about Go here instead of C.

> I also know Clang does the initial autovectorization, but is Clang targeting something like SSE4, which emcc then translates to wasm SIMD? Or can Clang autovectorize for a wasm target (and thus TinyGo too)?

Probably not SSE4, which is an x86 thing (not wasm). I'd guess that if the proper extension is enabled (+simd128), LLVM will try to autovectorize some things.
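To make "some things" a little more concrete, here is a minimal sketch of the kind of loop an autovectorizer typically targets: a simple element-wise pass over slices with no cross-iteration dependency. The function name `saxpy` and the test data are made up for illustration, and nothing in this thread confirms that TinyGo plus LLVM will actually vectorize this particular loop; bounds checks or other codegen details may get in the way.

```go
package main

// saxpy computes y[i] += a*x[i] for every element of x.
// Each iteration is independent of the others, which is the shape
// autovectorizers generally look for.
func saxpy(a float32, x, y []float32) {
	// Reslice y to the length of x so both slices visibly share one
	// length. This may help the compiler reason about bounds checks
	// (an assumption, not something verified for TinyGo).
	y = y[:len(x)]
	for i := range x {
		y[i] += a * x[i]
	}
}

func main() {
	x := []float32{1, 2, 3, 4, 5, 6, 7, 8}
	y := make([]float32, len(x))
	saxpy(2, x, y)
	println(y[0], y[7])
}
```

Whether v128 instructions come out of this depends on the enabled target features and the optimization level, so it is worth inspecting the generated wasm rather than assuming.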

Can you give a bit more background? Is there something you'd like to do but that's too slow at the moment?

@jeremy-coleman
Author

Hey, thanks for your response. I'm doing web graphics stuff at the moment, so performance and user experience are directly related. Since web graphics is often CPU-bound, SIMD could be a big win. I guess I just generally always need moooarrr. By the way, I mentioned SSE4 specifically because I think I remember V8 has/had checks up to SSE4; it may be here somewhere: https://source.chromium.org/chromium/chromium/src/+/main:v8/src/wasm/function-body-decoder-impl.h

I know AssemblyScript just enables the built-in SIMD opcodes when you enable SIMD (without any vectorizing). Since I have no idea how LLVM works, I currently imagine it is equivalent to teaching acorn/babel new AST types, but not necessarily transforming anything. It would be really cool if TinyGo could vectorize Go code with the simd128 flag.

@aykevl
Member

aykevl commented Oct 21, 2021

You could try to add the flag -llvm-features=+simd128. I haven't tested it, but it might work.
In general, TinyGo wasn't optimized for speed. So you will likely find many bottlenecks.
If you aren't already, you should use -opt=2, which is like -O2 in GCC/Clang (the default is -opt=z, which is like -Os in GCC or -Oz in Clang). It can sometimes make a big difference.
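One way to see whether these flags matter for a given workload is to time the same hot loop under each build. The sketch below is only an illustration: `sumSquares` is a hypothetical stand-in for real work, the output file names are made up, and `-target=wasm` is assumed for a browser build. The `-opt` and `-llvm-features` flags are the ones mentioned above.

```go
// Build variants to compare:
//   tinygo build -target=wasm -o size.wasm .            (default, -opt=z)
//   tinygo build -target=wasm -opt=2 -o speed.wasm .
//   tinygo build -target=wasm -opt=2 -llvm-features=+simd128 -o simd.wasm .
package main

import "time"

// sumSquares is a stand-in hot loop: an integer reduction over a slice.
func sumSquares(xs []uint32) uint32 {
	var s uint32
	for _, x := range xs {
		s += x * x
	}
	return s
}

func main() {
	xs := make([]uint32, 1<<16)
	for i := range xs {
		xs[i] = uint32(i % 7)
	}

	start := time.Now()
	var total uint32
	for rep := 0; rep < 200; rep++ {
		total += sumSquares(xs)
	}
	// Print the checksum so the work cannot be optimized away entirely.
	println("elapsed ms:", time.Since(start).Milliseconds(), "checksum:", total)
}
```

Timings from a loop this small are noisy, so treat the numbers as a rough comparison between builds rather than a benchmark.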

> By the way, I mentioned SSE4 specifically because I think I remember V8 has/had checks up to SSE4

Maybe they use SSE4 for something, but SSE is x86 only. It doesn't exist on ARM, MIPS, or WebAssembly. However, it is very likely that V8 converts WebAssembly SIMD instructions to SSE instructions when running on x86.

@deadprogram added the enhancement (New feature or request) label on Oct 30, 2021
@codefromthecrypt
Contributor

PS: recent TinyGo already uses features beyond WebAssembly 1.0 (a.k.a. the MVP), so optimizing for this in wasm could be useful. Notably, SIMD is part of the draft WebAssembly 2.0 Core spec: https://webassembly.github.io/spec/core/appendix/changes.html

@aykevl
Member

aykevl commented Sep 15, 2022

For anyone looking for how to optimize TinyGo binaries, the tinygo.org website will have a page on this with the next release: tinygo-org/tinygo-site#287

@rockwotj
Contributor

BTW I've confirmed that -llvm-features=+simd128 does generate some v128 wasm instructions.

@hpvd

hpvd commented Apr 17, 2025

FYI, there is a pretty long discussion on SIMD in big Go:
golang/go#67520

6 participants