Description
Our threading overhead seems significant. When I measure a fixed, purely computational workload, replacing the body of a pass like `precompute` to instead just do some silly work, and then measure with `time`, the `user` time is the same with `BINARYEN_CORES=1` (use 1 core) as when running normally with all cores. That makes sense, since `user` adds up the total actual work across all threads, and that total is the same either way. It also shows there isn't much synchronization overhead slowing us down in that case.
But that's not the typical case when running real passes: the multi-core `user` time can be much higher, see e.g. #2733 (comment), and I see similar things locally, with `user` being 2-3x larger when using 8 threads.
This may be a large speedup opportunity. One possibility is that we often have many tiny functions, and switching between them is costly. Or maybe there is contention on locks (see that last link, though this happens even after that PR, which should have removed that contention).
The thread-pool using code for running passes on functions is here:
Line 591 in dc5a503