Skip to content

Add option to run compiler tests in parallel #873

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Sep 29, 2019
Merged

Conversation

dcodeIO
Copy link
Member

@dcodeIO dcodeIO commented Sep 27, 2019

Given the --parallel option, the compiler test suite can now spawn one process per logical CPU core and split the tests among them. The goal was to make this 100% interchangeable with a normal run, producing the same output except test ordering. Reduces the time of npm run test:compiler from 32 to 17 13 seconds on my machine using 16 7 workers.

@MaxGraey
Copy link
Member

MaxGraey commented Sep 27, 2019

I wonder how this different with worker_threads in performance? I think cluster/child process fork have bigger overhead but separate event loops which could be good for I/O

@dcodeIO
Copy link
Member Author

dcodeIO commented Sep 27, 2019

From what I can tell so far this can be speed up by reducing the number of workers from 16 to 8, matching the number of physical cores, at least on my setup (from 17 to 14 seconds) and by running the std/math and std/typedarray tests, which take the most time, first (from 14 to 12 seconds).

@MaxGraey
Copy link
Member

MaxGraey commented Sep 27, 2019

Try also numWorkers = physicalCores - 1.

@dcodeIO
Copy link
Member Author

dcodeIO commented Sep 27, 2019

Ah, right. That's 0.5 seconds faster than half the physical cores :)

@dcodeIO
Copy link
Member Author

dcodeIO commented Sep 27, 2019

Last commit brings it down to 13 seconds without reordering tests.

@dcodeIO
Copy link
Member Author

dcodeIO commented Sep 27, 2019

From a node profile and a CPU profile I learned that most time is spent in either I/O, compiling a JS file, running the compiler or inside of Binaryen, while all cores spike at 100% at first with some dropping earlier than others. That's most likely because the remaining few long-running tests need to finish while some cores are already idle. I also found that running a non-parallel test utilizes one core at 100% and four at about 40%, so some stuff inside of node/V8 is threaded even in this case, even though the process presents itself as single-threaded. Explains why it's not a NUM_CORES times speedup.

@MaxGraey
Copy link
Member

What if replace this to instantiateStreaming?

@dcodeIO
Copy link
Member Author

dcodeIO commented Sep 28, 2019

Does that work in node? If you want, feel free to improve this even more :)

@MaxGraey
Copy link
Member

wait) I'm totally forgot that node.js does't support instantiateStreaming=)

@MaxGraey
Copy link
Member

But we still have async WebAssembly.instantiate. I checked this PR on my machine and all my cpus loads only on ~60-70% so we still have gap for improvements. I guess it should be I/O.

@dcodeIO
Copy link
Member Author

dcodeIO commented Sep 29, 2019

Suggesting to merge this as-is so everyone can try it on their machines, potentially submitting additional improvements. It's not enabled on CI yet, so that should be fine. Wdyt?

@MaxGraey
Copy link
Member

MaxGraey commented Sep 29, 2019

sure. Let's do it)

@dcodeIO dcodeIO merged commit 024d9de into master Sep 29, 2019
@MaxGraey MaxGraey mentioned this pull request Sep 29, 2019
@dcodeIO dcodeIO deleted the parallel-tests branch November 8, 2019 01:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants