Status & Roadblocks for a portable Emscripten #11175
Comments
Doesn't emsdk serve the purpose of providing emscripten fully bundled with all its dependencies? Indeed that seems to be the sole purpose of emsdk. We could do more (for example, I'm looking into bundling python3 with the macOS version in order to avoid users having to install it themselves). Admittedly, emsdk only supports 3 operating systems. Do you have specific other platforms you want to run the toolchain on? It sounds like you are defining portable as "runs in JavaScript", is that right? (Not criticizing this definition, just trying to define what you are asking for here). We also have the emscripten docker image, which packages up even more dependencies and has even fewer system dependencies: https://hub.docker.com/r/trzeci/emscripten/. |
The portable idea, which I probably should clarify in the first post, would mean you could put it on a flashdrive and plug & play with most Linux distros, macOS, and Windows 10. Prebuilt binaries would work, but the eventual goal would be to have emscripten be executable on the client side of a website. Node has prebuilt binaries, and could be an intermediary to reach web-JavaScript, but ideally all dependencies would eventually be compiled to Wasm. Emscripten, especially with docker, is perfectly easy to set up for most devs. My use case is needing WASM compilation in an app with a non-technical audience that wouldn't be comfortable installing, or even checking a version of, python. Beyond that it would be desirable to have the global version of python be independent of the one used by emscripten. |
Emscripten SDK is designed to be portable for flashdrive plug&play use, but unfortunately not "all-in-one" for all OSes in one go. You could get around this by preparing a USB flash drive with a different folder for each OS. On Windows, emsdk operates by bundling python 3, because Windows does not offer a system python installation. On Linux, a system python 3 installation is expected, and on macOS, a system python 2 installation is expected. |
@juj that general approach would work. What would be the best way to target different OSes and create those three folders? Is there a way to perform the install command beforehand, and target each OS from a single OS? If a standalone python executable was created, how might I go about directing emscripten to use it instead of looking for a global one? |
Emsdk itself does not have a cross-compilation architecture. However if you build manually, you would be able to cross-compile to each OS, as long as you set up cross-compiler archs accordingly.
The python installation that emsdk ships for Windows is a portable/standalone python installation. You can look at the emsdk_manifest.json files on how it solves that, and reuse the same scheme for other OSes. Instead of editing emsdk_manifest.json for those OSes, check how emsdk precreates the .emscripten config file and the env. vars for Windows, which locate the portable python installation. The same scheme can be used for the other OSes, it will work identically there. |
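As a rough illustration of the scheme described above (one folder per OS on the drive, with env vars locating a standalone python), here is a minimal sketch. The folder names, the `python/` layout, and the `portable_env` helper are all hypothetical. It also assumes the launcher scripts consult `EMSDK_PYTHON` when picking an interpreter, which should be checked against the emsdk version in use.

```python
import os
import platform
from pathlib import Path

# Hypothetical layout on the flash drive: one folder per OS, each holding
# its own emsdk tree and a standalone python. The names are illustrative only.
OS_FOLDERS = {"Windows": "win", "Darwin": "mac", "Linux": "linux"}


def portable_env(drive_root, system=None, base_env=None):
    """Return a copy of the environment that points at the bundled python."""
    system = system or platform.system()
    if system not in OS_FOLDERS:
        raise ValueError(f"no bundled toolchain for {system!r}")
    root = Path(drive_root) / OS_FOLDERS[system]
    exe = "python.exe" if system == "Windows" else "bin/python3"
    python = root / "python" / exe
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: the launcher scripts honor EMSDK_PYTHON.
    env["EMSDK_PYTHON"] = str(python)
    # Put the bundled interpreter first so a global python is not picked up.
    env["PATH"] = os.pathsep.join([str(python.parent), env.get("PATH", "")])
    return env
```

The returned dictionary can be passed as `env=` to `subprocess.run` when invoking the toolchain, so nothing in the user's global environment needs to change.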
If you want to see how we bundle the windows python in the emsdk, I wrote this little script that generates it: https://github.com/emscripten-core/emsdk/blob/master/scripts/update_python.py |
I'd love to see this happen! LLVM and Binaryen are the easy parts. There are ports of both to js+wasm (using node.js file access), although the LLVM side is less polished, but it could be. A week or so ago I helped an LLVM-using project port itself to wasm, it took only about a day, although we did add a bunch of hacks along the way. If you're fine running node.js in Electron, that sounds fine. (I am personally also interested in a 100% Web solution, which means no Node, or limited functionality.) The bigger issue is Python. There are ports of it, and they work, but really just the pure computation side. Emscripten uses python to do things like file access, network access, multiprocessing, etc. Those things are harder to port. I see 2 main routes here:
I wish I had time for these myself, but I can help someone else get started and with any questions! |
@juj This is fantastic to hear, I had no idea emsdk was shipped with a portable python for windows.
Thanks for the info, this is definitely enough that I can get started with. And thanks @sbc100 that is also really helpful! |
@kripken your news is also great to hear. Do you have any links to the LLVM repo you worked on? I reached out to the author of the clang-wasm demo, but the repo was still very alpha (the iostream header doesn't even work). The python multiprocessing will certainly be very hard to port, I didn't realize that was part of the code base. I'll be exploring all the ways; portable python, auto-convert and fix bugs, or manual translation from python to node. For fun sometime I'll probably make a rough python-to-javascript syntax converter although it'll probably be more like 75% instead of 90% hands-free conversion. The multiprocessing code will be the real work in almost any case though. |
If I were you I would start with wasi-sdk. It's just one or two standalone binaries. You will learn a lot in the process and, if you are successful in building a version that suits your needs, then you might consider extending the project to emscripten, which is strictly harder (by several orders of magnitude IMHO). |
It's not my project, and I'm not sure they want it shared widely yet, sorry. (But there isn't much there that would help you - basically, the port was just: run emcmake cmake and then make, and add a few hacks to avoid missing features.)
Btw, I did some searching meanwhile, and found JavaScripthon. It looks like a pretty serious effort at translating Python to JavaScript, so might be worth looking at! |
After getting more familiar with the codebase, I should partially apologize as I should've done a better job reading the readme. I can see why my first few comments on portability were confusing since I was misunderstanding the dependencies. You guys are really friendly despite it not making too much sense! (I meant to post this issue to emsdk instead of core) In my defense, I think there is a subtle issue on emscripten.org, which I was using instead of the readme. The Download and Install page that's linked by Google has this:
And following that link you reach:
Followed by
Following that "all the required tools" link ends up here:
Which seems like the normal requirements, since there are explicit statements about building from source like "Git is required if building tools from source." But upon closer inspection I see now that the whole list is under the building from source page. That's why I was really surprised @juj when you mentioned bundled python with windows (since emscripten.org said to install python ≥ 2.7.12 on Windows). |
So to revise the issue: 99% of my needs are met! (Even if it's still far away from a web implementation) I'm really impressed how you guys bundle everything, I can't remember the last time I've used a tool that didn't need global non-preinstalled dependencies. I've been getting started on the last 1%, which is to have a single folder corresponding to a specific version of emscripten. I'm still interested in reducing the dependencies as much as possible, but it will probably be a while before I put significant work into converting the whole code base to node or compiling LLVM itself to WASM/WASI. Running it with a portable linux python will probably be my next goal, since I'm working on building my own standalone tool on top of WASM that might end up being run inside a docker container. |
@kripken thanks for sharing JavaScripthon, I didn't see that in my search. It looks really interesting, I might use it on other projects as well.
Is there any interest in slowly moving away from python? Either towards node, or towards something like Rust/C++ that can be compiled to WASM. |
Yes, I think that would be good to explore (I don't have a strong opinion between the various options, each has advantages). Would be great if someone experimented with this, something like getting "hello world" to work with an emcc.py replacement. |
That's good to hear. In that case I may rewrite some files when I'm working to understand them. I've coincidentally been working on something that should help: a Node recreation of a library for scripting tasks. It seems like a decent chunk of the python is simple tasks like zip/unzip, temp dirs, or converting between windows/unix filepaths. I found the mention of the "embedded" mode in the code and took advantage of that. I didn't see anything about the embedded mode in the documentation though. I fully built a The only issue on the Mac test was that it rebuilds the cache on the first run even though the cache was already built. It would be really nice to not have to rebuild it since it takes a good 10 min, but I'm not sure if that's worth forcing: is the cache dependent on OS-version/hardware architecture? I could probably force it to check the relative path, but it's not worth pursuing if it's going to result in a corrupt cache. |
If you install with "./emsdk --embedded" the resulting tree should be completely portable/movable. If you install emscripten with "emsdk" the cache should these days come fully populated. The cache should be platform independent, although as it happens the emsdk builders for each OS build their own cache. It would be interesting to figure out why the cache is being cleared. EMCC_DEBUG=1 might help you here. |
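For anyone following along, one way to act on the EMCC_DEBUG=1 tip is sketched below. It only sets the variable for a single emcc run and filters the debug output. The helper names are made up for illustration, and the idea that the relevant log lines contain the word "cache" is an assumption, not something confirmed in this thread.

```python
import os
import subprocess


def debug_emcc_command(source, output, emcc="emcc", base_env=None):
    """Build the argv and environment for one emcc run with debug logging on."""
    env = dict(os.environ if base_env is None else base_env)
    env["EMCC_DEBUG"] = "1"  # the variable suggested above
    return [emcc, source, "-o", output], env


def cache_lines(stderr_text):
    """Keep only log lines that mention the cache (assumed keyword)."""
    return [line for line in stderr_text.splitlines() if "cache" in line.lower()]


def run_with_debug(source, output, emcc="emcc"):
    """Run emcc once (it must be on PATH) and return cache-related log lines."""
    argv, env = debug_emcc_command(source, output, emcc)
    result = subprocess.run(argv, env=env, capture_output=True, text=True)
    return cache_lines(result.stderr)
```

Calling `run_with_debug("hello.cpp", "hello.js")` on the first run after moving the tree should show whether, and possibly why, the cache is being rebuilt.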
@sbc100 Thanks for the info. I'm using emsdk so I'll figure out what's going on with that debugging tip. Thanks also for the brief explanation of the embedded option. I (of course) found the documentation immediately after reading your comment, so I'll be spending a lot more time reading. |
Embedded is now the default mode for emsdk as of a couple of days ago! |
(did you mean now? Instead of not) |
Yes! :) |
whoop! 🎊 🚀 |
It would be nice to see emscripten as a complete .wasm file / a single executable that can take C++ as input and output wasm. Right now I have a desire to get emscripten working on the web, but it appears that this may not be possible. I would use pkg as opposed to electron though. Electron is for porting the entire browser environment; pkg is only for nodejs apps. |
Wow! That is very impressive indeed. |
Thanks for mentioning that @jeff-hykin ! As mentioned in #6432 (comment) , I think we can close this now. See notes there about possible followup issues. |
What this issue is about
I hope for this to be an ongoing collaboration/discussion for creating some form of Emscripten with fully bundled dependencies. Portable meaning a flashdrive inserted into a fresh install of any major OS, with no internet connection, could compile C++ to wasm, and wouldn't break with changes to global versions of system packages. Possibly some other people, along with myself, would like to work on this, but we could use help getting started. (A related discussion was here: #9313 , however I wanted this issue to include portability outside of a browser context)
In theory, the basic job of Emscripten is a purely functional operation: input in one language, output in another language. So (very much in-theory) it is possible for that operation to be done in an OS-independent portable way, and even possibly in a browser using WASM. And to be clear, "portable" doesn't mean practical or primary; this isn't about removing dependencies from emscripten-master. A 1 hour compile time requiring 32 GB of RAM for a hello world.cpp using an out-of-date emscripten would still count as an initial success despite being far from ideal.
Work Overview
The main dependencies as I understand them are python, LLVM with clang & wasm-ld, node.js, binaryen, and the emscripten code itself.
Portability
Questions
Did I misunderstand any of the dependencies: Is cmake optional for Mac/Linux even though it's in the OS pre-reqs? Are LLVM and binaryen downloaded to the local emsdk folder at installation time?
What obvious & non-obvious challenges are there for having emscripten use portable versions of Python and LLVM?
Current rough plan
Create a python command that is actually using node.js and pyodide in order to be portable. Use decorators to catch python system calls and manually run them with the correct ENV variables and portable versions of the executables. Perform something similar for LLVM.
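To make the decorator idea in the plan a bit more concrete, here is a minimal sketch in plain Python. It wraps a `subprocess.run`-style function so that the executable name is swapped for a portable copy and extra environment variables are merged in. The tool paths, the `PORTABLE_ROOT` variable, and the helper names are placeholders, and the sketch assumes commands are passed as argument lists rather than shell strings.

```python
import functools
import os
import subprocess

# Hypothetical mapping from tool names to portable copies (placeholder paths).
PORTABLE_TOOLS = {
    "clang": "/portable/llvm/bin/clang",
    "node": "/portable/node/bin/node",
}
# Hypothetical extra variables the portable tools would need.
PORTABLE_ENV = {"PORTABLE_ROOT": "/portable"}


def redirect_to_portable(run):
    """Decorator: rewrite argv[0] and merge env before calling `run`."""
    @functools.wraps(run)
    def wrapper(args, *rest, env=None, **kwargs):
        args = list(args)  # assumes a list-style argv, not a shell string
        name = os.path.basename(args[0])
        args[0] = PORTABLE_TOOLS.get(name, args[0])
        merged = dict(os.environ if env is None else env)
        merged.update(PORTABLE_ENV)
        return run(args, *rest, env=merged, **kwargs)
    return wrapper


# Wrap a local alias instead of monkey-patching subprocess globally.
portable_run = redirect_to_portable(subprocess.run)
```

Code that calls `portable_run(["clang", "--version"])` would then launch the portable clang. Tools not listed in the mapping fall through unchanged, which keeps the wrapper safe to apply broadly.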