Add a prebuilt version for Apple Silicon (M1) #4334
cc @sbc100 |
@alexcrichton is this something wasm-pack can do, or should we try to do it here in the binaryen CI? Do you want to have a go at this? Even if we don't build an arm binary, wasm-pack should be able to download and use the x86 version, right? That might be the quickest way to get things working. |
Ah I don't work on wasm-pack any more, so I don't know where this would best be done. |
@d3lm can you open a bug on wasm-pack and suggest they just use the x86-64 binary when running on arm? (for now). |
@sbc100 Yep I can do that. Thanks. |
But just to understand, the x86-64 binary does not work out of the box on arm, right? You'd have to run this via Rosetta, no? |
My understanding is the M1 macs do support running x86-64 out of the box (i.e. they ship with whatever tooling they need to make this work). |
My understanding is that x86-64 binaries can be executed transparently without any special commands and launch sequences. |
Ah ok, and wasm-pack doesn't have a prebuilt x86-64? |
Binaryen does provide an x86-64 prebuilt of wasm-opt. I'm suggesting that wasm-pack look for this binary rather than the aarch64 (arm64) version (which does not exist). |
Aha! Got it. Thanks a lot. |
Ok, after some discussion I still think it would be best to have a

```rust
Tool::WasmOpt => {
    Ok(format!(
        "https://github.com/WebAssembly/binaryen/releases/download/{vers}/binaryen-{vers}-{target}.tar.gz",
        vers = "version_90",
        target = target,
    ))
}
```

Why can't we add a build for macos-12 too? It's quite common and I think it would be right to ship a release for it as part of binaryen. |
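Since no aarch64 release of binaryen exists, the simplest fix on the wasm-pack side is to fall back to the x86_64 macOS build, which M1 Macs can run under Rosetta. A minimal Rust sketch of that fallback; the function names here are hypothetical and only the release URL pattern comes from the snippet above:

```rust
// Hypothetical sketch of a wasm-pack-side fallback: when the host is
// aarch64 macOS, reuse the x86_64 binaryen release (runs via Rosetta).
fn binaryen_target(host: &str) -> &str {
    match host {
        // No aarch64-apple-darwin release exists, so map it to x86_64.
        "aarch64-apple-darwin" => "x86_64-apple-darwin",
        other => other,
    }
}

fn download_url(version: &str, host: &str) -> String {
    format!(
        "https://github.com/WebAssembly/binaryen/releases/download/{vers}/binaryen-{vers}-{target}.tar.gz",
        vers = version,
        target = binaryen_target(host),
    )
}

fn main() {
    // On any other host triple the mapping is the identity,
    // so existing platforms are unaffected.
    println!("{}", download_url("version_90", "aarch64-apple-darwin"));
}
```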
So the binaryen filenames are of the form

Of course, if you want to get the last bit of performance benefit from having a native arm64 build, and you would like to contribute the GitHub action recipe for building it, that would be most welcome. But I don't see why you can't use the x86_64 version for now.

As for targeting a certain/newer macOS version, what would be the point of that? I don't know of any benefit of targeting a more recent version, but maybe I'm missing something? I think we want to be as compatible as we can with all versions (within reason), which means setting a low minimum version when building, right? |
Because that requires rosetta and it'd be great if we could avoid rosetta and have a native arm64 build. Or am I wrong? |
I see if I can contribute and submit an action for building it for arm64. |
I actually think that building for Apple Silicon in a GitHub action is not really possible right now, because there's no environment that runs an M1. Though, you could pass |
Why is avoiding Rosetta an important goal here? Don't all M1 macs ship with Rosetta built in? I get that it would be nice to have, but it doesn't seem particularly urgent or important. Do you have an M1 mac that doesn't have Rosetta installed? Or are you worried about the overhead of running wasm-opt in emulation mode? (is your wasm-opt phase very slow?) |
Yes, if we wanted to have github build these binaries I imagine it would be cross compile.. i.e. produce an arm64 build on x86_64 hardware. |
I think so yes.
No, because I think it's installed by default.
Yes, but those may be unjustified worries, because I have not yet seen any performance issues. For the time being, I submitted a PR to wasm-pack rustwasm/wasm-pack#1088 that downloads the |
Great! That sounds like a good place to be for now. |
Another option here is to make a wasm build of wasm-opt. With node pthreads support + exceptions support, that should run pretty fast in recent node. Not as fast as a native M1 build I'm sure, but it might be faster than an |
@kripken I actually had the same idea! That would work too which would eliminate the interoperability issues across different systems. |
I did a little testing of that now using this diff:

```diff
diff --git a/CMakeLists.txt b/CMakeLists.txt
index e540b1f57..0dc9c204b 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -265,26 +265,32 @@ else()
     add_nondebug_compile_flag("-UNDEBUG")
   endif()
 endif()

 if(EMSCRIPTEN)
   # link with -O3 for metadce and other powerful optimizations. note that we
   # must use add_link_options so that this appears after CMake's default -O2
   add_link_options("-O3")
   add_link_flag("-s SINGLE_FILE")
   add_link_flag("-s ALLOW_MEMORY_GROWTH=1")
-  add_compile_flag("-s DISABLE_EXCEPTION_CATCHING=0")
-  add_link_flag("-s DISABLE_EXCEPTION_CATCHING=0")
+  add_compile_flag("-fwasm-exceptions")
+  add_link_flag("-fwasm-exceptions")
+  add_compile_flag("-pthread")
+  add_link_flag("-pthread")
+  add_link_flag("-s PROXY_TO_PTHREAD")
+  add_link_flag("-s EXIT_RUNTIME")
+  add_link_flag("-Wno-pthreads-mem-growth")
   # make the tools immediately usable on Node.js
   add_link_flag("-s NODERAWFS")
   # in opt builds, LTO helps so much (>20%) it's worth slow compile times
-  add_nondebug_compile_flag("-flto")
+  #add_nondebug_compile_flag("-flto")
 endif()

 # clang doesn't print colored diagnostics when invoked from Ninja
 if(UNIX AND CMAKE_GENERATOR STREQUAL "Ninja")
   if(CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
     add_compile_flag("-fdiagnostics-color=always")
   elseif(CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
     add_compile_flag("-fcolor-diagnostics")
   endif()
 endif()

diff --git a/src/support/threads.cpp b/src/support/threads.cpp
index ab9de4175..7aec720ba 100644
--- a/src/support/threads.cpp
+++ b/src/support/threads.cpp
@@ -132,24 +132,25 @@ void ThreadPool::initialize(size_t num) {
       threads.clear();
       return;
     }
   }
   DEBUG_POOL("initialize() waiting\n");
   condition.wait(lock, [this]() { return areThreadsReady(); });
   DEBUG_POOL("initialize() is done\n");
 }

 size_t ThreadPool::getNumCores() {
-#ifdef __EMSCRIPTEN__
+#if defined(__EMSCRIPTEN__) && !defined(__EMSCRIPTEN_PTHREADS__)
   return 1;
 #else
   size_t num = std::max(1U, std::thread::hardware_concurrency());
   if (getenv("BINARYEN_CORES")) {
     num = std::stoi(getenv("BINARYEN_CORES"));
   }
   return num;
 #endif
 }

 ThreadPool* ThreadPool::get() {
   DEBUG_POOL("::get()\n");
   // lock on the creation
```

Everything works as expected when I optimize some large files, but it is surprisingly slow. It is using multiple cores, and in fact uses them more than a native build. Native has this:

and node 16.5 has this:

I'm not sure what's going wrong here. Perhaps the slower wasm atomics are hurting us since they are all sequentially consistent...? Anyhow, sadly using wasm isn't an easy solution for this issue. |
Thanks so much for investigating a WASM version! It's a bummer that it's so much slower 😞 but I can also imagine that Atomics are slowing it down a lot. I can't imagine that using

I'd love to loop @tschneidereit in as well. Maybe you have an idea why it would be significantly slower than native? I suppose we have to wait for an M1 environment for GitHub actions or build an |
If we have a recent enough SDK on our existing mac builder, it should be very straightforward to cross-compile for aarch64. |
We now have macOS ARM releases.
wasm-pack uses wasm-opt and it tries to download the binary for it when it gets executed. However, if you run this on an M1 it gives the following error:

It would be great if there was a prebuilt version for ARM so wasm-opt runs out of the box on an M1 Mac.