From b5a73c44acea983cf3a046cf96582dacdca671ca Mon Sep 17 00:00:00 2001
From: Timothy Maloney
Date: Mon, 6 Sep 2021 13:11:09 -0700
Subject: [PATCH 1/6] Docs: consolidated parallelism information

---
 src/overview.md                               |  7 +--
 src/parallel-rustc.md                         | 61 ++++++++++++++-----
 .../query-evaluation-model-in-detail.md       | 26 --------
 3 files changed, 47 insertions(+), 47 deletions(-)

diff --git a/src/overview.md b/src/overview.md
index 6c756e18f..4f6d0ffb9 100644
--- a/src/overview.md
+++ b/src/overview.md
@@ -291,12 +291,7 @@
 Compiler performance is a problem that we would like to improve on (and are
 always working on). One aspect of that is parallelizing `rustc` itself.
 
-Currently, there is only one part of rustc that is already parallel: codegen.
-During monomorphization, the compiler will split up all the code to be
-generated into smaller chunks called _codegen units_. These are then generated
-by independent instances of LLVM. Since they are independent, we can run them
-in parallel. At the end, the linker is run to combine all the codegen units
-together into one binary.
+Currently, there is only one part of rustc that is parallel by default: codegen.
 
 However, the rest of the compiler is still not yet parallel. There have been
 lots of efforts spent on this, but it is generally a hard problem. The current
diff --git a/src/parallel-rustc.md b/src/parallel-rustc.md
index eec8219a5..243dca98e 100644
--- a/src/parallel-rustc.md
+++ b/src/parallel-rustc.md
@@ -1,24 +1,54 @@
 # Parallel Compilation
 
-Most of the compiler is not parallel. This represents an opportunity for
-improving compiler performance.
+As of September 2021, The only stage of the compiler
+that is already parallel is codegen. The nightly compiler implements query evaluation,
+but there is a lot of correctness work that needs to be done. The lack of parallelism at other stages
+also represents an opportunity for improving compiler performance. One can try out the current
+parallel compiler work by enabling it in the `config.toml`.
 
-As of July 2021, work on explicitly parallelizing the
-compiler has stalled. There is a lot of design and correctness work that needs
-to be done.
+These next few sections describe where and how parallelism is currently used,
+and the current status of making parallel compilation the default in `rustc`.
+
+The underlying thread-safe data-structures used in the parallel compiler
+can be found in `rustc_data_structures/sync.rs`. Some of these data structures
+use the `parking_lot` API.
+
+## Code Gen
+
+During [monomorphization][monomorphization] the compiler splits up all the code to
+be generated into smaller chunks called _codegen units_. These are then generated by
+independent instances of LLVM running in parallel. At the end, the linker
+is run to combine all the codegen units together into one binary.
+
+## Query System
 
-One can try out the current parallel compiler work by enabling it in the
-`config.toml`.
+The query model has some properties that make it actually feasible to evaluate
+multiple queries in parallel without too much of an effort:
 
-There are a few basic ideas in this effort:
+- All data a query provider can access is accessed via the query context, so
+  the query context can take care of synchronizing access.
+- Query results are required to be immutable so they can safely be used by
+  different threads concurrently.
 
-- There are a lot of loops in the compiler that just iterate over all items in
-  a crate. These can possibly be parallelized.
-- We can use (a custom fork of) [`rayon`] to run tasks in parallel. The custom
-  fork allows the execution of DAGs of tasks, not just trees.
-- There are currently a lot of global data structures that need to be made
-  thread-safe. A key strategy here has been converting interior-mutable
-  data-structures (e.g. `Cell`) into their thread-safe siblings (e.g. `Mutex`).
+
+When a query `foo` is evaluated, the cache table for `foo` is locked.
+
+- If there already is a result, we can clone it, release the lock and
+  we are done.
+- If there is no cache entry and no other active query invocation computing the
+  same result, we mark the key as being "in progress", release the lock and
+  start evaluating.
+- If there *is* another query invocation for the same key in progress, we
+  release the lock, and just block the thread until the other invocation has
+  computed the result we are waiting for. This cannot deadlock because, as
+  mentioned before, query invocations form a DAG. Some thread will always make
+  progress.
+
+## Current Status
+
+As of July 2021, work on explicitly parallelizing the
+compiler has stalled. There is a lot of design and correctness work that needs
+to be done.
 
 [`rayon`]: https://crates.io/crates/rayon
 
@@ -45,3 +75,4 @@ are a bit out of date):
 [imlist]: https://github.com/nikomatsakis/rustc-parallelization/blob/master/interior-mutability-list.md
 [irlo1]: https://internals.rust-lang.org/t/help-test-parallel-rustc/11503
 [tracking]: https://github.com/rust-lang/rust/issues/48685
+[monomorphization]:https://rustc-dev-guide.rust-lang.org/backend/monomorph.html
diff --git a/src/queries/query-evaluation-model-in-detail.md b/src/queries/query-evaluation-model-in-detail.md
index 4c2427e3c..b84a5dac4 100644
--- a/src/queries/query-evaluation-model-in-detail.md
+++ b/src/queries/query-evaluation-model-in-detail.md
@@ -211,29 +211,3 @@ much of a maintenance burden.
 
 To summarize: "Steal queries" break some of the rules in a controlled way. There
 are checks in place that make sure that nothing can go silently wrong.
-
-
-## Parallel Query Execution
-
-The query model has some properties that make it actually feasible to evaluate
-multiple queries in parallel without too much of an effort:
-
-- All data a query provider can access is accessed via the query context, so
-  the query context can take care of synchronizing access.
-- Query results are required to be immutable so they can safely be used by
-  different threads concurrently.
-
-The nightly compiler already implements parallel query evaluation as follows:
-
-When a query `foo` is evaluated, the cache table for `foo` is locked.
-
-- If there already is a result, we can clone it, release the lock and
-  we are done.
-- If there is no cache entry and no other active query invocation computing the
-  same result, we mark the key as being "in progress", release the lock and
-  start evaluating.
-- If there *is* another query invocation for the same key in progress, we
-  release the lock, and just block the thread until the other invocation has
-  computed the result we are waiting for. This cannot deadlock because, as
-  mentioned before, query invocations form a DAG. Some thread will always make
-  progress.

From e5cf0e6b63aced03a9df5e55b9522b94b300370f Mon Sep 17 00:00:00 2001
From: Timothy Maloney
Date: Mon, 6 Sep 2021 13:18:17 -0700
Subject: [PATCH 2/6] Docs: delete redundant use of correctness

---
 src/parallel-rustc.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/parallel-rustc.md b/src/parallel-rustc.md
index 243dca98e..7f20078b9 100644
--- a/src/parallel-rustc.md
+++ b/src/parallel-rustc.md
@@ -2,7 +2,7 @@
 
 As of September 2021, The only stage of the compiler
 that is already parallel is codegen. The nightly compiler implements query evaluation,
-but there is a lot of correctness work that needs to be done. The lack of parallelism at other stages
+but there is still a lot of work to be done. The lack of parallelism at other stages
 also represents an opportunity for improving compiler performance. One can try out the current
 parallel compiler work by enabling it in the `config.toml`.
 

From 8027e53780592962d5314884a50818f9862bf402 Mon Sep 17 00:00:00 2001
From: Timothy Maloney
Date: Mon, 6 Sep 2021 13:42:56 -0700
Subject: [PATCH 3/6] Docs: added section discussing core ideas

---
 src/parallel-rustc.md | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/src/parallel-rustc.md b/src/parallel-rustc.md
index 7f20078b9..a29e4f974 100644
--- a/src/parallel-rustc.md
+++ b/src/parallel-rustc.md
@@ -48,7 +48,22 @@ When a query `foo` is evaluated, the cache table for `foo` is locked.
 
 As of July 2021, work on explicitly parallelizing the
 compiler has stalled. There is a lot of design and correctness work that needs
-to be done.
+to be done.
+
+These are the basic ideas in the effort to make `rustc` parallel:
+
+- All data a query provider can access is accessed via the query context, so
+  the query context can take care of synchronizing access.
+- Query results are required to be immutable so they can safely be used by
+  different threads concurrently.
+
+- There are a lot of loops in the compiler that just iterate over all items in
+  a crate. These can possibly be parallelized.
+- We can use (a custom fork of) [`rayon`] to run tasks in parallel. The custom
+  fork allows the execution of DAGs of tasks, not just trees.
+- There are currently a lot of global data structures that need to be made
+  thread-safe. A key strategy here has been converting interior-mutable
+  data-structures (e.g. `Cell`) into their thread-safe siblings (e.g. `Mutex`).
 
 [`rayon`]: https://crates.io/crates/rayon
 

From c49e07eb07f6729a80866549b295940d9c20894a Mon Sep 17 00:00:00 2001
From: Timothy Maloney
Date: Mon, 6 Sep 2021 13:45:44 -0700
Subject: [PATCH 4/6] Docs: deleted copy

---
 src/parallel-rustc.md | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/src/parallel-rustc.md b/src/parallel-rustc.md
index a29e4f974..67f349ac3 100644
--- a/src/parallel-rustc.md
+++ b/src/parallel-rustc.md
@@ -30,7 +30,6 @@ multiple queries in parallel without too much of an effort:
 - Query results are required to be immutable so they can safely be used by
   different threads concurrently.
 
-
 When a query `foo` is evaluated, the cache table for `foo` is locked.
 
 - If there already is a result, we can clone it, release the lock and
@@ -52,11 +51,6 @@ to be done.
 
 These are the basic ideas in the effort to make `rustc` parallel:
 
-- All data a query provider can access is accessed via the query context, so
-  the query context can take care of synchronizing access.
-- Query results are required to be immutable so they can safely be used by
-  different threads concurrently.
-
 - There are a lot of loops in the compiler that just iterate over all items in
   a crate. These can possibly be parallelized.
 - We can use (a custom fork of) [`rayon`] to run tasks in parallel. The custom

From cfb5586026357201d583678702c91224262f382c Mon Sep 17 00:00:00 2001
From: Timothy Maloney
Date: Mon, 6 Sep 2021 16:08:07 -0700
Subject: [PATCH 5/6] Docs: made suggested fix

---
 src/parallel-rustc.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/parallel-rustc.md b/src/parallel-rustc.md
index 67f349ac3..d1eff7ad7 100644
--- a/src/parallel-rustc.md
+++ b/src/parallel-rustc.md
@@ -13,7 +13,7 @@ The underlying thread-safe data-structures used in the parallel compiler
 can be found in `rustc_data_structures/sync.rs`. Some of these data structures
 use the `parking_lot` API.
 
-## Code Gen
+## Codegen
 
 During [monomorphization][monomorphization] the compiler splits up all the code to
 be generated into smaller chunks called _codegen units_. These are then generated by

From 6065f20c0b8e5678f54af8cdfe956e25357fcb89 Mon Sep 17 00:00:00 2001
From: Timothy Maloney
Date: Tue, 7 Sep 2021 08:52:52 -0700
Subject: [PATCH 6/6] Docs: added section on rustdoc

---
 src/parallel-rustc.md | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/src/parallel-rustc.md b/src/parallel-rustc.md
index d1eff7ad7..38230377b 100644
--- a/src/parallel-rustc.md
+++ b/src/parallel-rustc.md
@@ -43,6 +43,12 @@ When a query `foo` is evaluated, the cache table for `foo` is locked.
   mentioned before, query invocations form a DAG. Some thread will always make
   progress.
 
+## Rustdoc
+
+As of September 2021, there are still a number of steps
+to complete before rustdoc rendering can be made parallel. More details on
+this issue can be found [here][parallel-rustdoc].
+
 ## Current Status
 
 As of July 2021, work on explicitly parallelizing the
@@ -85,3 +91,4 @@ are a bit out of date):
 [irlo1]: https://internals.rust-lang.org/t/help-test-parallel-rustc/11503
 [tracking]: https://github.com/rust-lang/rust/issues/48685
 [monomorphization]:https://rustc-dev-guide.rust-lang.org/backend/monomorph.html
+[parallel-rustdoc]:https://github.com/rust-lang/rust/issues/82741
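
Note on the `## Query System` text moved in patch 1/6: the three-case cache protocol it describes (result already cached, no entry yet, another invocation in progress) can be illustrated with a small standalone sketch. This is not rustc's implementation. `QueryCache`, `Entry`, `get_or_compute`, and `demo` are hypothetical names, the key and result types are fixed to `u32` and `String` for brevity, and the sketch uses the standard library's `Mutex` and `Condvar` rather than the `parking_lot`-based types in `rustc_data_structures/sync.rs`. It also assumes a provider never requests its own key, which corresponds to the DAG property the text relies on to rule out deadlock.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// State of one cache entry (hypothetical type, for illustration only).
enum Entry {
    /// Some thread has claimed the key and is running the provider.
    InProgress,
    /// The result is available; results are immutable, so sharing is safe.
    Done(Arc<String>),
}

/// A toy cache table for a single query.
struct QueryCache {
    table: Mutex<HashMap<u32, Entry>>,
    /// Signalled whenever an in-progress entry becomes `Done`.
    finished: Condvar,
}

impl QueryCache {
    fn new() -> Self {
        QueryCache {
            table: Mutex::new(HashMap::new()),
            finished: Condvar::new(),
        }
    }

    fn get_or_compute(&self, key: u32, provider: impl FnOnce(u32) -> String) -> Arc<String> {
        // Lock the table. While another invocation is computing this key,
        // `wait_while` releases the lock and blocks this thread.
        let mut table = self
            .finished
            .wait_while(self.table.lock().unwrap(), |t| {
                matches!(t.get(&key), Some(Entry::InProgress))
            })
            .unwrap();

        // Case 1: a result already exists, so clone the handle and return.
        if let Some(Entry::Done(value)) = table.get(&key) {
            return Arc::clone(value);
        }

        // Case 2: no entry yet. Mark the key as in progress, release the
        // lock, and evaluate the provider without holding it.
        table.insert(key, Entry::InProgress);
        drop(table);
        let value = Arc::new(provider(key));

        // Publish the result and wake every thread waiting on this table.
        self.table
            .lock()
            .unwrap()
            .insert(key, Entry::Done(Arc::clone(&value)));
        self.finished.notify_all();
        value
    }
}

/// Four threads request the same key. Returns how many times the provider
/// ran and the values the threads observed.
fn demo() -> (usize, Vec<String>) {
    let cache = Arc::new(QueryCache::new());
    let runs = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let cache = Arc::clone(&cache);
            let runs = Arc::clone(&runs);
            thread::spawn(move || {
                let value = cache.get_or_compute(7, |key| {
                    runs.fetch_add(1, Ordering::SeqCst);
                    format!("result for {}", key)
                });
                (*value).clone()
            })
        })
        .collect();
    let values = handles.into_iter().map(|h| h.join().unwrap()).collect();
    (runs.load(Ordering::SeqCst), values)
}
```

Calling `demo()` should report a single provider run, because only the first thread to take the lock finds no entry; the others either wait on the condition variable or find the finished result. The sketch leaves out dependency tracking, cycle detection, and recovery when a provider panics (the key would stay `InProgress` and waiters would block forever), all of which a real query system has to handle.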