Skip to content

"local count too large" with large generated C++ function in debug mode #19346

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
skelly-energid opened this issue May 12, 2023 · 26 comments
Open

Comments

@skelly-energid
Copy link

skelly-energid commented May 12, 2023

Please include the following in your bug report:

Version of emscripten/emsdk:
emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.34 (57b21b8)
clang version 17.0.0 (https://github.com/llvm/llvm-project a031f72187ce495b9faa4ccf99b1e901a3872f4b)
Target: wasm32-unknown-emscripten
Thread model: posix

Using this python script to generate the code:

#!/usr/bin/env python3

import os
import sys

code = ""

for i in range(0, 1200):
    code += f"""case {i}:
            otherfunc(arg2, "foo", local2, local1, local3, arg1, arg3, arg4, arg5, arg6, arg7, arg8);
            break;
    """

with open("generated.cpp", "w") as f:
    f.write(
f"""

void otherfunc (...) {{
}}

void somefunc(int arg1, int arg2, char arg3, int arg4, int arg5, int arg6, int arg7, int arg8) {{
  char local1 = 0;
  char local2 = 0;
  char local3 = 0;

  switch (arg4) {{
    {code}
    default:
      break;
  }}
}}

int main(int, char**)
{{
    somefunc(0, 0, 's', 0, 0, 0, 0, 0);
    return 0;
}}

""")

and this command line:

../generate.py && em++   -g -o main.html generated.cpp

Chrome fails with something like

main.js:870 Uncaught (in promise) RuntimeError: Aborted(CompileError: WebAssembly.instantiate(): Compiling function #2 failed: local count too large @+452)

Firefox fails with something like

Uncaught (in promise) RuntimeError: Aborted(CompileError: wasm validation error: at offset 455: too many locals)

I don't really know what is meant by locals here, because there are not many local variables.

@sbc100
Copy link
Collaborator

sbc100 commented May 12, 2023

Does the problem go away when you run with -O1 or -O2?

Can you try also with -sERROR_ON_WASM_CHANGES_AFTER_LINK -sWASM_BIGINT? There are two possible causes here, firstly llvm could be generating a function with that many locals, secondly one of the binaryen optimizations could be causing it. If it still fails with -sERROR_ON_WASM_CHANGES_AFTER_LINK -sWASM_BIGINT then we would know it was llvm that was at fault. I'm not sure what we can do about that though.. it might be simplest to suggest that you break up your huge function.

@skelly-energid
Copy link
Author

skelly-energid commented May 12, 2023

Thanks, yes, it is only reproducible in debug mode. If I add -O1 I can't reproduce it anymore. I can still reproduce it without -O1 and with -sERROR_ON_WASM_CHANGES_AFTER_LINK -sWASM_BIGINT.

I've already broken up the generated function, and that has solved the problem for me. As you say though, this could still be a llvm bug.

@mike-lischke
Copy link

I got the same error when I enabled Asan via build parameter -fsanitize=address. What works is -fsanitize=undefined, though.

@sbc100
Copy link
Collaborator

sbc100 commented Jul 24, 2023

@mike-lischke, does adding -sERROR_ON_WASM_CHANGES_AFTER_LINK and -sWASM_BIGINT fix the issue for you?

@mike-lischke
Copy link

I'm on travel with limited internet connection. Will check when I return in 2 weeks.

@mike-lischke
Copy link

mike-lischke commented Aug 9, 2023

@sbc100 I have now tested your suggestion and get this error:

em++: error: changes to the wasm are required after link, but disallowed by ERROR_ON_WASM_CHANGES_AFTER_LINK: ['/opt/homebrew/Cellar/emscripten/3.1.43/libexec/binaryen/bin/wasm-emscripten-finalize', '--output-source-map-url=antlr4-runtime-wasm.wasm.map', '-g', '--bigint', '--no-dyncalls', '--no-legalize-javascript-ffi', '--dwarf', 'wasm/antlr4-runtime-wasm.wasm', '-o', 'wasm/antlr4-runtime-wasm.wasm']

Unclear why a change to the wasm file would be necessary. The full command line is:

em++ \
wasm/antlr4-runtime-wrapper.cpp \
antlr4-cpp-runtime/*.cpp\
antlr4-cpp-runtime/atn/*.cpp\
antlr4-cpp-runtime/dfa/*.cpp\
antlr4-cpp-runtime/internal/*.cpp\
antlr4-cpp-runtime/misc/*.cpp\
antlr4-cpp-runtime/support/*.cpp\
antlr4-cpp-runtime/tree/*.cpp\
antlr4-cpp-runtime/tree/pattern/*.cpp\
antlr4-cpp-runtime/tree/xpath/*.cpp\
-O0\
-std=c++17 -lembind -Iantlr4-cpp-runtime/ \
-fwasm-exceptions\
-o wasm/antlr4-runtime-wasm.js\
-s EXPORT_ES6=1\
-s WASM=1\
-s WASM_BIGINT=1\
-s ALLOW_MEMORY_GROWTH=1\
-s DEMANGLE_SUPPORT=1\
-g\
-gsource-map\
-s ERROR_ON_WASM_CHANGES_AFTER_LINK=1\
-fsanitize=address

@sbc100
Copy link
Collaborator

sbc100 commented Aug 9, 2023

Can you try without -gsource-map?

@mike-lischke
Copy link

Also without source maps the same error comes up (failed to asynchronously prepare wasm: CompileError: WebAssembly.instantiate(): Compiling function #77:"embind_init_main()" failed: local count too large @+51266).

@sbc100
Copy link
Collaborator

sbc100 commented Aug 10, 2023

And this was with -sERROR_ON_WASM_CHANGES_AFTER_LINK? i.e. did you manage to link with that flag?

What other flags are you using? I know -fsanitize=address can cause a lot of extra locals to get added. We have to disable several tests under asan because of the local count too large" issue. Do you see the this error even without -fsanitize=address?

@mike-lischke
Copy link

mike-lischke commented Aug 11, 2023

Yes, I could link without -gsource-map. The "local count too large" comes up when I run the test script. These are the flags I use now:

    -O0\
    -std=c++17 -lembind -Iantlr4-cpp-runtime/ \
    -o wasm/antlr4-runtime-wasm.js\
    -s EXPORT_ES6=1\
    -s WASM=1\
    -s WASM_BIGINT=1\
    -s ALLOW_MEMORY_GROWTH=1\
    -s DEMANGLE_SUPPORT=1\
    -s ERROR_ON_WASM_CHANGES_AFTER_LINK=1\
    -fwasm-exceptions\
    -fsanitize=address\
    -g

Using -fsanitize=undefined instead makes the app running, but then I cannot debug that memory alignment issue I have.

@sbc100
Copy link
Collaborator

sbc100 commented Aug 11, 2023

I see, so this seems like an issue with -fsanitize=address resulting in too many locals being created. This will most likely require an upstream llvm change, and I'm not quite sure that it would look like.

BTW, do you have any information about the function in question (the one with too many locals): embind_init_main? Is it a particularly long or complex function?

One possible workaround here would be to split this function up at the source level into several smaller functions. In the case of EMSCRIPTEN_BINDINGS (which it looks like this is) you could create several blocks such as :

EMSCRIPTEN_BINDINGS(main1, {
...
}

EMSCRIPTEN_BINDINGS(main2) {
...
}
etc.

@mike-lischke
Copy link

mike-lischke commented Aug 11, 2023

Somehow I anticipated this question :-) But no, this is not from my code. I searched my entire project and could not find it, not even in generated code. How much are "to many locals"? However, since I want to make all public classes available in JS/TS, I have bound many classes already. If the EMSCRIPTEN_BINDINGS is the culprit I can certainly structure that differently.

What are those names in that macro call actually? The documentation doesn't really explain them. Do they matter in some way?

Update: That was it! I split the main bindings (around 2000 lines, 53 classes) into 6 separated binding functions and now I can execute the code and get an error report:

SUMMARY: AddressSanitizer: heap-use-after-free (file:///Volumes/Extern/Work/projects/antlr4wasm/wasm/antlr4-runtime-wasm.wasm+0x84f1f1) in antlr4::atn::LexerATNSimulator::match(antlr4::CharStream*, unsigned long)+0x84f1f1

That's great! Many thanks for your help Sam!

@sbc100
Copy link
Collaborator

sbc100 commented Aug 11, 2023

The name passed as the first argument to EMSCRIPTEN_BINDINGS doesn't really mean anything.. it just gives the generated function a unique name. So you can use whatever name you want as long as don't repeat names (since that will result in multiple symbols with the same name.

@mmomtchev
Copy link

@sbc100 I have the same problem when compiling a huge 300k lines SWIG-generated C++ file. In my case the problem is very real - it happens in a function which registers all the exported classes which has 768 blocks of this type:

do {
 // register class and methods
} while(0);

Every block contains 4 local variables. When I use -O2 the compiler must optimize some (a large part?) of those variables and everything works. Otherwise I get the local count too large. Isn't it possible to configure this limit somewhere?

@sbc100
Copy link
Collaborator

sbc100 commented Oct 11, 2023

I believe the local count too large error is coming from the wasm runtime (e.g. v8) and I don't think it configurable, no.

We could perhaps have binaryen (wasm-opt) and/or llvm try to limit the number of locals they allow themselves to use. It might be easier on the binaryen side but I'm not sure. @tlively @kripken WDTY?

In the short term I'm afraid that your best best is probably to attempt to break up your large function into smaller ones.

@kripken
Copy link
Member

kripken commented Oct 11, 2023

Binaryen does do this when it optimizes. I think that's why as reported above this works with -O2. It's just a problem in unoptimized builds.

As a workaround, you can do some manual optimization of the wasm like this: wasm-opt --simplify-locals --coalesce-locals a.wasm -o a.wasm, which in most cases should fix this (an entire run of -O would be even better, but much slower).

@mmomtchev
Copy link

@kripken Alas, wasm-opt produces a binary that segfaults on instantiateStreaming. I had to also add --enable-bulk-memory as directed.

Finally, building just that one source file with -O1 while the rest is with -O0 produces a working binary, so this is acceptable to me for debugging.

@yowl
Copy link

yowl commented Jan 24, 2024

I notice that llvm geps where the offset is a constant produce a local in the wat which is unneeded and optimised away with O1 . It would seem possible to eliminate that.

IoIxD added a commit to raylib-rs/raylib-rs that referenced this issue May 7, 2024
* build, callbacks: fix/clarify wasm building

* build: wasm building requires -O3 to be set in EMCC_CFLAGS, apperently. see emscripten-core/emscripten#19346

* bindings: moved functions for handling log callbacks over to c code

* bindings: use our own raylib.h in the c code

* audio: rename the 'ready' functions to match their function names

* test: updated test to match AsRef PR

* audio: brought back PhantomData for RaylibAudio so that the struct is and !Send and !Sync

* audio: AsRef for more functions

* raylib/cargo, raylib-sys/cargo: realized i had left in some unused imports

* callbacks: make custom_trace_log_callback public as a stab in the dark to see if that fixes the windows issue

* callbacks,tests: realized that since rust marks our external function as having 'char *' as the first argument on windows-gnu

* callbacks: realized the c code wasn't wrapped in extern 'c' when compiling in c++

* callbacks: have windows compile a cpp file instead of a c file for utils_log

* fixed the windows link error but now it complains about the char fix i did

* log: vsprintf -> vsnprintf

* log: use the result of vsnprintf instead of strlen

* build: removed some flags i forgot i added.

* log: actually, just use from_ptr, because what raylib gives us is always going to end in \0

* tests: disable callbacks test under windows after confirming that the issue literally only happens when we use cargo test

* tests: disable callbacks test under windows after confirming that the issue literally only happens when we use cargo test

* ci: remove windows test after confirming it only fails under CI. ran cargo clippy --fix
@kohakukun
Copy link
Contributor

I'm getting the same issue, the difference is that, if I enable your suggested flag @sbc100 ERROR_ON_WASM_CHANGES_AFTER_LINK I get the error:
em++: error: changes to the wasm are required after link, but disallowed by ERROR_ON_WASM_CHANGES_AFTER_LINK: ['emsdk/upstream/bin/wasm-emscripten-finalize', '-g', '--bigint', '--no-legalize-javascript-ffi', '--check-stack-overflow', '../../bin/release/out.wasm', '-o', '../../bin/release/out.wasm']

Also, I notice I get this error when addind the ASYNCIFY option

@sbc100
Copy link
Collaborator

sbc100 commented Jun 20, 2024

@kohakukun can you remove share the original link command that caused the failure? Along with the failure message itself? Does it happen if you add -O1?

@MikhailGorobets
Copy link

MikhailGorobets commented Jul 9, 2024

I also have the problem with the ASYNCIFY flag

@xbcnn
Copy link
Contributor

xbcnn commented Aug 19, 2024

I met same issue with ASYNCIFY too.

@bartadaniel
Copy link

I'm having the same issue after compiling including FFmpeg for debug. If I exclude the decoders, it works, but that is far from optimal.

	emcc ./lib/web-demuxer/*.c ./lib/web-demuxer/*.cpp \
		-lembind \
		-I./lib/FFmpeg \
		-L./lib/FFmpeg/libavformat -lavformat \
		-L./lib/FFmpeg/libavutil -lavutil \
		-L./lib/FFmpeg/libavcodec -lavcodec \
		--post-js ./lib/web-demuxer/post.js \
		-lworkerfs.js \
		-s EXPORT_ES6=1 \
		-s INVOKE_RUN=0 \
		-s ENVIRONMENT=worker \
		-s ASYNCIFY \
		-s ALLOW_MEMORY_GROWTH=1 \
	        -O0 \
	        -g \
	        --minify 0 \
	        --profiling \
	        -sWASM_BIGINT \
	        --closure-args=--debug

@kripken
Copy link
Member

kripken commented Oct 25, 2024

Does the workaround of manually optimizing the wasm not work for anyone?

I see a comment about a "segfault on instantiateStreaming" in that case, but that sounds like a separate browser issue. If we can get a testcase for it, we can file a bug on the relevant browser.

Otherwise, there isn't anything more we can do: reducing the amount of locals is something optimizations do, so this browser limit on locals can be hit in debug builds, unfortunately. Always running wasm-opt even in debug builds is a possibility to avoid this, but it would make all debug builds slower, so I think that wouldn't be the best option. Instead, running wasm-opt manually on the rare cases this happens seems best.

@yomaytk
Copy link

yomaytk commented Nov 11, 2024

@kripken
I successfully executed with -O2 or -O3.
I got a warning Some VMs may not accept this binary because it has a large number of locals in function when I compiled the program that includes the relatively large function "without optimization" (i.e. -O0) and got the same error local found too large when I executed the generated wasm.
So it works with optimization at least in my case (because optimization pass removed the many local variables?).

@bartadaniel
Copy link

bartadaniel commented Dec 20, 2024

I can compile with -O2 or -O3 but that makes run-time source-level debugging using the DWARF info wonky. I assume this is the expected behavior with all the heavy optimizations going on.
It would be helpful to have an option to disable these hard limits in the browser during development to avoid the issue. I'll raise the question to the Chromium team.

edit: link to the chromium issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests