Skip to content

Add a prebuilt version for Apple Silicon (M1) #4334

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
d3lm opened this issue Nov 16, 2021 · 26 comments
Closed

Add a prebuilt version for Apple Silicon (M1) #4334

d3lm opened this issue Nov 16, 2021 · 26 comments

Comments

@d3lm
Copy link

d3lm commented Nov 16, 2021

wasm-pack uses wasm-opt and it tries to download the binary for it when it gets executed. However, if you run this on an M1 it gives the following error:

Error: no prebuilt wasm-opt binaries are available for this platform: Unrecognized target!

It would be great if there was a prebuilt version for ARM so wasm-opt runs out of the box on an M1 Mac.

@kripken
Copy link
Member

kripken commented Nov 17, 2021

cc @sbc100

@sbc100
Copy link
Member

sbc100 commented Nov 17, 2021

@alexcrichton is this something wasm-pack can do or should we try to do it here in the binaryen CI? Do you want to have go at this?

Even if we don't build an arm binary wasm-pack should be able to download and use the x86 version right? That might be the quickest way to get things working.

@alexcrichton
Copy link
Contributor

Ah I don't work on wasm-pack any more, so I don't know where this would best be done.

@sbc100
Copy link
Member

sbc100 commented Nov 18, 2021

@d3lm can you open a bug on wasm-pack and suggest they just the x86-64 binary when running on arm? (for now).

@d3lm
Copy link
Author

d3lm commented Nov 19, 2021

@sbc100 Yep I can do that. Thanks.

@d3lm
Copy link
Author

d3lm commented Nov 19, 2021

But just to understand, the x86-64 does not work out of the box on arm right? You'd have to run this via rosetta no?

@sbc100
Copy link
Member

sbc100 commented Nov 19, 2021

My understanding is the M1 macs do support running x86-64 out of the box (i.e. they ship with whatever tooling they need to make this work).

@sbc100
Copy link
Member

sbc100 commented Nov 19, 2021

My understanding is that x86-64 binaries can be executed transparently without any special commands and launch sequences.

@d3lm
Copy link
Author

d3lm commented Nov 19, 2021

Ah ok, and wasm-pack doesn't have a prebuilt x86-64?

@sbc100
Copy link
Member

sbc100 commented Nov 19, 2021

Binaryen does provide and x86-64 prebuilt of wasm-opt. I'm suggesting the wasm-opt look for this binary rather than the aarch64 (arm64) version (which doesn't not exist).

@d3lm
Copy link
Author

d3lm commented Nov 20, 2021

Aha! Got it. Thanks a lot.

@d3lm
Copy link
Author

d3lm commented Dec 3, 2021

Ok, after some discussion I still think it would be best to have a osx_aarch64 release of binaryen because wasm-pack simply downloads it like this

Tool::WasmOpt => {
  Ok(format!(
    "https://github.com/WebAssembly/binaryen/releases/download/{vers}/binaryen-{vers}-{target}.tar.gz",
     vers = "version_90",
     target = target,
   ))
}

Why can't we add a build for macos-12 too? It's quite common and I think it would be right to ship a release for it as part of binaryen.

@sbc100
Copy link
Member

sbc100 commented Dec 3, 2021

So the binaryen filenames are of the form binaryen-version_100-x86_64-macos.tar.gz so why can't wasm-pack simply do if (target == aarch64-macos) target = x86_64-macos right before trying to download?

Of course, if you want to get the last bit of performance benefit from having a native arm64 build, and you would like to contribute the github action recipe for building it that would be most welcome. But I don't see why you can't use the x86_64 version for now.

As for targeting a certain/newer macos version, what would be point of that? I don't know of any benefit of targeting a more recent version, but maybe I'm missing something? I think we we want to be as compatible as can with all versions (within reason) which means setting a low minimum version when build, right?

@d3lm
Copy link
Author

d3lm commented Dec 3, 2021

Because that requires rosetta and it'd be great if we could avoid rosetta and have a native arm64 build. Or am I wrong?

@d3lm
Copy link
Author

d3lm commented Dec 3, 2021

I see if I can contribute and submit an action for building it for arm64.

@d3lm
Copy link
Author

d3lm commented Dec 3, 2021

I actually think that building for Apple Sillicon in a GitHub action is not really possible right now, because there's no environment that runs an M1. Tho, you could pass -DCMAKE_OSX_ARCHITECTURES=arm64 to cmake, but I am not sure if that would be enough. Do you have any experience with this?

@sbc100
Copy link
Member

sbc100 commented Dec 3, 2021

Because that requires rosetta and it'd be great if we could avoid rosetta and have a native arm64 build. Or am I wrong?

Why is avoiding rosetta an important goal here? Don't all M1 macs ship with rosetta builtin? I get that it would be nice to have.. but it doesn't seem particularly urgent or important?

Do you have an M1 mac that doesn't have rosetta installed?

Or are you worried about that overhead of running wasm-opt in emulation mode? (is your wasm-opt phase very slow?)

@sbc100
Copy link
Member

sbc100 commented Dec 3, 2021

Yes, if we wanted to have github build these binaries I imagine it would be cross compile.. i.e. produce an arm64 build on x86_64 hardware.

@d3lm
Copy link
Author

d3lm commented Dec 3, 2021

Don't all M1 macs ship with rosetta builtin?

I think so yes.

Do you have an M1 mac that doesn't have rosetta installed?

No, because I think it's installed by default.

Or are you worried about that overhead of running wasm-opt in emulation mode?

Yes, but that may be unjustified worries, because I have not yet seen any performance issues.

For the time being, I submitted a PR to wasm-pack rustwasm/wasm-pack#1088 that downloads the x86_64 build for aarch64. If issues pop up we can re-evaluate or try to set up a native arm64 build on x86_64 hardware.

@sbc100
Copy link
Member

sbc100 commented Dec 3, 2021

Great! That sounds a good place to be for now.

@kripken
Copy link
Member

kripken commented Dec 3, 2021

Another option here is to make a wasm build of wasm-opt. With node-pthreads support + exceptions support that should run pretty fast in recent node. Not as fast as a native M1 build I'm sure, but it might be faster than an x86_64 build emulated on M1... which makes this potentially interesting.

@d3lm
Copy link
Author

d3lm commented Dec 3, 2021

@kripken I actually had the same idea! That would work too which would eliminate the interoperability issues across different systems.

@kripken
Copy link
Member

kripken commented Dec 7, 2021

I did a little testing of that now using this diff:

diff --git a/CMakeLists.txt b/CMakeLists.txt
index e540b1f57..0dc9c204b 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -265,26 +265,32 @@ else()
     add_nondebug_compile_flag("-UNDEBUG")
   endif()
 endif()
 
 if(EMSCRIPTEN)
   # link with -O3 for metadce and other powerful optimizations. note that we
   # must use add_link_options so that this appears after CMake's default -O2
   add_link_options("-O3")
   add_link_flag("-s SINGLE_FILE")
   add_link_flag("-s ALLOW_MEMORY_GROWTH=1")
-  add_compile_flag("-s DISABLE_EXCEPTION_CATCHING=0")
-  add_link_flag("-s DISABLE_EXCEPTION_CATCHING=0")
+  add_compile_flag("-fwasm-exceptions")
+  add_link_flag("-fwasm-exceptions")
+  add_compile_flag("-pthread")
+  add_link_flag("-pthread")
+  add_link_flag("-s PROXY_TO_PTHREAD")
+  add_link_flag("-s EXIT_RUNTIME")
+  add_link_flag("-Wno-pthreads-mem-growth")
   # make the tools immediately usable on Node.js
   add_link_flag("-s NODERAWFS")
   # in opt builds, LTO helps so much (>20%) it's worth slow compile times
-  add_nondebug_compile_flag("-flto")
+  #add_nondebug_compile_flag("-flto")
 endif()
 
 # clang doesn't print colored diagnostics when invoked from Ninja
 if(UNIX AND CMAKE_GENERATOR STREQUAL "Ninja")
   if(CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
     add_compile_flag("-fdiagnostics-color=always")
   elseif(CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
     add_compile_flag("-fcolor-diagnostics")
   endif()
 endif()
diff --git a/src/support/threads.cpp b/src/support/threads.cpp
index ab9de4175..7aec720ba 100644
--- a/src/support/threads.cpp
+++ b/src/support/threads.cpp
@@ -132,24 +132,25 @@ void ThreadPool::initialize(size_t num) {
       threads.clear();
       return;
     }
   }
   DEBUG_POOL("initialize() waiting\n");
   condition.wait(lock, [this]() { return areThreadsReady(); });
   DEBUG_POOL("initialize() is done\n");
 }
 
 size_t ThreadPool::getNumCores() {
-#ifdef __EMSCRIPTEN__
+#if defined(__EMSCRIPTEN__) && !defined(__EMSCRIPTEN_PTHREADS__)
   return 1;
 #else
   size_t num = std::max(1U, std::thread::hardware_concurrency());
   if (getenv("BINARYEN_CORES")) {
     num = std::stoi(getenv("BINARYEN_CORES"));
   }
   return num;
 #endif
 }
 
 ThreadPool* ThreadPool::get() {
   DEBUG_POOL("::get()\n");
   // lock on the creation

Everything works as expected when I optimize some large files, but it is surprisingly slow. It is using multiple cores, and in fact uses them more than a native build. Native has this:

66.95user 0.67system 0:14.33elapsed 471%CPU (0avgtext+0avgdata 608316maxresident)k

and node 16.5 has this:

768.36user 237.38system 2:37.22elapsed 639%CPU (0avgtext+0avgdata 762028maxresident)k

I'm not sure what's going wrong here. Perhaps the slower wasm atomics are hurting us since they are all sequentially consistent...? Anyhow, sadly using wasm isn't a easy solution for this issue.

@d3lm
Copy link
Author

d3lm commented Dec 8, 2021

Thanks so much for investigating a WASM version of wasm-opt @kripken. Appreciate it a lot!

It's a bummer that it's so much slower 😞 but I can also imagine that Atomics are slowing it down a lot. I can't imagine that using pthreads (which is mapped to web workers or worker threads) would be a problem here.

I'd love to loop @tschneidereit in as well. Maybe you have an idea why it would be significantly slower than native?

I suppose we have to wait for an M1 environment for GitHub actions or build an aarch64 with a cross compiler. WDYT @kripken?

@dschuff
Copy link
Member

dschuff commented Dec 16, 2021

If we have a recent enough SDK on our existing mac builder, It should be very straightforward to cross-compile for aarch64.

@tlively
Copy link
Member

tlively commented May 16, 2025

We now have MacOS Arm releases.

@tlively tlively closed this as completed May 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants