Reduce threading overhead #2740
Some TODOs along this vein:
As multivalue becomes common, we will also want to:
Measuring with different core counts, it does seem though that we gain almost nothing from using the reported system cores versus half of them. My guess is hyperthreading doesn't really help us, since we are very much bound by CPU work (no I/O to wait on, and we are cache-friendly by having small data structures and running as many passes as possible on a single function before moving on to the next). But I'm not sure we can do anything about that.
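As a hedged sketch of what "half the reported cores" could look like (`pickWorkerCount` is a hypothetical name, not anything in binaryen): `hardware_concurrency()` reports logical cores, hyperthreads included, so halving it roughly approximates physical cores.

```cpp
#include <algorithm>
#include <thread>

// Hypothetical helper, not binaryen's actual logic: cap the worker count
// at half the reported logical cores to approximate physical cores.
unsigned pickWorkerCount() {
  unsigned logical = std::thread::hardware_concurrency(); // may return 0
  if (logical == 0) {
    return 1; // count unknown; fall back to a single worker
  }
  return std::max(logical / 2, 1u);
}
```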
fwiw, still seeing huge overhead with the v105 release
Our threading overhead seems significant. When I measure a fixed pure computational workload, replacing the body of a pass like `precompute` to instead just do some silly work, then measuring with `time`, the `user` time is the same when `BINARYEN_CORES=1` (use 1 core) and when running normally with all cores. That makes sense since the total actual work is added up in `user`, and it's the same. And there isn't much synchronization overhead that slows us down.

But that's not the typical case when running real passes: the `user` time for multi-core can be much higher, see e.g. #2733 (comment), and I see similar things locally, with `user` being 2-3x larger when using 8 threads.

This may be a large speedup opportunity. One possibility is that we often have many tiny functions, and maybe switching between them is costly? Or maybe there is contention on locks (see that last link, but this happens even after that PR, which should get rid of that).
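As an illustration of the experiment (not the actual code used), the "silly work" could be a fixed busy loop with no I/O, no allocation, and a tiny working set, so total `user` time should stay flat no matter how many threads share it:

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical stand-in for a gutted pass body: a fixed amount of pure
// CPU work whose total cost is independent of the thread count.
uint64_t sillyWork(uint64_t seed) {
  uint64_t x = seed;
  for (int i = 0; i < 10'000'000; i++) {
    x = x * 6364136223846793005ULL + 1442695040888963407ULL; // LCG step
  }
  return x;
}

int main() {
  // Print the result so the compiler cannot optimize the loop away.
  std::printf("%llu\n", (unsigned long long)sillyWork(42));
}
```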
The thread-pool using code for running passes on functions is here: `binaryen/src/passes/pass.cpp`, line 591 (at commit dc5a503).
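For reference, a minimal sketch of that per-function dispatch pattern (a simplification, not the actual pass.cpp code): workers claim one function's worth of work at a time, so with many tiny functions the per-item handoff and any contention can dominate the useful work.

```cpp
#include <atomic>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Simplified sketch of per-function work distribution: each worker
// repeatedly claims the next unprocessed function via an atomic counter.
void runOnFunctions(std::vector<std::function<void()>>& tasks,
                    unsigned numThreads) {
  std::atomic<std::size_t> next{0};
  std::vector<std::thread> workers;
  for (unsigned i = 0; i < numThreads; i++) {
    workers.emplace_back([&] {
      while (true) {
        std::size_t index = next.fetch_add(1); // claim the next function
        if (index >= tasks.size()) {
          break; // all functions claimed
        }
        tasks[index](); // run the pass on this function
      }
    });
  }
  for (auto& worker : workers) {
    worker.join();
  }
}
```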