Split `execute_job` into `execute_job_incr` and `execute_job_non_incr` #109046

Zoxc · 2023-03-12T09:39:51Z

`execute_job` was a bit large, so this splits it in two. Performance was neutral locally, but this may affect bootstrap times.
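As a rough illustration of the kind of split described above (not the actual rustc code), the sketch below separates one function into an incremental and a non-incremental path and picks one up front. `Ctx`, `execute`, `execute_incr`, `execute_non_incr` and their fields are hypothetical stand-ins invented for this example.

```rust
/// Toy stand-in for the query context. `incremental` mimics whether
/// dependency tracking is enabled. All names here are hypothetical.
struct Ctx {
    incremental: bool,
    next_virtual_index: u32,
    recorded_deps: Vec<u32>,
}

/// Non-incremental path: compute the value and hand out a fresh virtual index.
fn execute_non_incr(ctx: &mut Ctx, key: u32, compute: impl Fn(u32) -> u32) -> (u32, u32) {
    let result = compute(key);
    let index = ctx.next_virtual_index;
    ctx.next_virtual_index += 1;
    (result, index)
}

/// Incremental path: additionally records the key as a dependency node
/// and returns its position in the recorded list as the index.
fn execute_incr(ctx: &mut Ctx, key: u32, compute: impl Fn(u32) -> u32) -> (u32, u32) {
    let result = compute(key);
    ctx.recorded_deps.push(key);
    let index = (ctx.recorded_deps.len() - 1) as u32;
    (result, index)
}

/// The caller chooses one path once, so each body stays small and
/// contains no mode checks of its own.
fn execute(ctx: &mut Ctx, key: u32, compute: impl Fn(u32) -> u32) -> (u32, u32) {
    if ctx.incremental {
        execute_incr(ctx, key, compute)
    } else {
        execute_non_incr(ctx, key, compute)
    }
}
```

With the mode check hoisted into the caller, each specialized function can be optimized on its own, which is presumably why the split could affect compile and bootstrap times in either direction.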

rustbot · 2023-03-12T09:39:57Z

r? @michaelwoerister

(rustbot has picked a reviewer for you, use r? to override)

cjgillot · 2023-03-12T11:10:26Z

compiler/rustc_query_system/src/query/plumbing.rs

-            }
+    let prof_timer = qcx.dep_context().profiler().query_provider();
+    let result = qcx.start_query(job_id, query.depth_limit(), None, || query.compute(qcx, key));
+    let dep_node_index = qcx.dep_context().dep_graph().next_virtual_depnode_index();


Could you encapsulate next_virtual_depnode_index? The goal is to make sure that it is only called when DepGraph::data is None.

For instance, making `DepGraph::data` return some marker type `DisabledDepGraph` on which we can call that method?
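A minimal sketch of what such a marker type could look like, assuming a heavily simplified `DepGraph`. The `DepGraphData` struct, the `virtual_index` field and the `Result`-returning `data()` accessor are assumptions made for this example and are not rustc's real definitions.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Toy dependency-graph data, present only in incremental mode.
struct DepGraphData;

/// Simplified stand-in for `DepGraph` (not the rustc type).
struct DepGraph {
    data: Option<DepGraphData>,
    virtual_index: AtomicU32,
}

/// Marker proving that the graph was observed with `data == None`.
struct DisabledDepGraph<'a>(&'a DepGraph);

impl DepGraph {
    /// Hypothetical accessor: yields either the real data or the marker.
    fn data(&self) -> Result<&DepGraphData, DisabledDepGraph<'_>> {
        match &self.data {
            Some(data) => Ok(data),
            None => Err(DisabledDepGraph(self)),
        }
    }
}

impl DisabledDepGraph<'_> {
    /// Only reachable through the marker, i.e. when the graph is disabled.
    fn next_virtual_depnode_index(&self) -> u32 {
        self.0.virtual_index.fetch_add(1, Ordering::Relaxed)
    }
}
```

Because the method lives on the marker rather than on `DepGraph`, calling it while dependency tracking is enabled becomes a type error instead of a runtime bug.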

I tried to make a marker type struct DisabledDepGraph<'a, K: DepKind>(&'a DepGraph<K>), but it doesn't seem like LLVM was able to optimize it away:

Benchmark	Before (time)	After (time)	Change (%)
🟣 clap:check	1.7102s	1.7175s	0.43%
🟣 hyper:check	0.2533s	0.2538s	0.23%
🟣 regex:check	0.9534s	0.9567s	0.35%
🟣 syn:check	1.5869s	1.5971s	0.64%
🟣 syntex_syntax:check	6.1097s	6.1317s	0.36%
Total	10.6134s	10.6568s	0.41%
Summary	1.0000s	1.0040s	0.40%

What about DisabledDepGraph(Lrc<AtomicU32>), and have the dep-graph hand out references to that struct?
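One way to read this suggestion, sketched with `std::sync::Arc` standing in for rustc's `Lrc` alias. The field names, the `Option<()>` payload and the `disabled()` accessor are hypothetical and exist only for this illustration.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;

/// Owns only the shared counter, so no back-pointer to the graph is needed.
/// `Arc` stands in for rustc's `Lrc` alias here.
struct DisabledDepGraph(Arc<AtomicU32>);

impl DisabledDepGraph {
    fn next_virtual_depnode_index(&self) -> u32 {
        self.0.fetch_add(1, Ordering::Relaxed)
    }
}

/// Simplified stand-in for `DepGraph` (not the rustc type).
struct DepGraph {
    /// `Some` when dependency tracking is on (payload elided in this toy).
    data: Option<()>,
    /// Handed out by reference only when `data` is `None`.
    disabled: DisabledDepGraph,
}

impl DepGraph {
    /// Hypothetical accessor that exposes the marker only for a disabled graph.
    fn disabled(&self) -> Option<&DisabledDepGraph> {
        if self.data.is_none() {
            Some(&self.disabled)
        } else {
            None
        }
    }
}
```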

That sounds equivalent performance-wise.

How about adding assert!(self.data.is_none()); to DepGraph::next_virtual_depnode_index()?
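For comparison, the assertion-based variant might look roughly like the following. The `DepGraph` struct is a toy stand-in, and the counter field name is an assumption rather than the real rustc field.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Simplified stand-in for `DepGraph` (not the rustc type).
struct DepGraph {
    /// `Some` when dependency tracking is on (payload elided in this toy).
    data: Option<()>,
    /// Hypothetical name for the counter backing virtual indices.
    virtual_dep_node_index: AtomicU32,
}

impl DepGraph {
    fn next_virtual_depnode_index(&self) -> u32 {
        // Runtime guard: virtual indices are only meant for the case
        // where no dependency-graph data exists.
        assert!(self.data.is_none());
        self.virtual_dep_node_index.fetch_add(1, Ordering::Relaxed)
    }
}
```

Unlike a marker type, this check happens at runtime on every call, so it costs a branch unless the optimizer can prove the condition at the call site.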

That seems to be cheaper:

Benchmark	Before (time)	After (time)	Change (%)
🟣 clap:check	1.7151s	1.7189s	0.22%
🟣 hyper:check	0.2516s	0.2525s	0.34%
🟣 regex:check	0.9532s	0.9538s	0.06%
🟣 syn:check	1.5406s	1.5400s	-0.04%
🟣 syntex_syntax:check	5.9116s	5.9193s	0.13%
Total	10.3722s	10.3845s	0.12%
Summary	1.0000s	1.0014s	0.14%

There's a 0.01% code size increase, so the assertion probably doesn't get optimized away.

Zoxc · 2023-03-12T13:02:06Z

Could I get a perf run? I want to see if this affects bootstrap times before growing the PR larger.

cjgillot · 2023-03-12T13:43:25Z

@bors try @rust-timer queue

bors · 2023-03-12T13:43:34Z

⌛ Trying commit 5bdc711b0c18afd0867e296085eafd4437bfcb9c with merge 8202e830070034d6387c18efbcdec4084be63938...

bors · 2023-03-12T16:23:54Z

☀️ Try build successful - checks-actions
Build commit: 8202e830070034d6387c18efbcdec4084be63938

rust-timer · 2023-03-12T21:06:46Z

Finished benchmarking commit (8202e830070034d6387c18efbcdec4084be63938): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.5%	[-0.6%, -0.3%]	2
All ❌✅ (primary)	-	-	0

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	3.8%	[2.5%, 6.0%]	3
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-4.3%	[-4.3%, -4.3%]	1
All ❌✅ (primary)	-	-	0

Cycles

This benchmark run did not return any relevant results for this metric.

michaelwoerister · 2023-03-15T14:20:03Z

The changes look good to me.
r? @cjgillot (in case you'd like to further discuss #109046 (comment))

rustbot · 2023-03-15T14:20:05Z

Could not assign reviewer from: cjgillot.
User(s) cjgillot are either the PR author or are already assigned, and there are no other candidates.
Use r? to specify someone else to assign.

cjgillot · 2023-03-18T14:18:31Z

r=me with the assertion #109046 (comment) added.

michaelwoerister · 2023-03-20T09:43:40Z

@bors r=cjgillot,michaelwoerister

bors · 2023-03-20T09:43:42Z

📌 Commit c4bcac6 has been approved by cjgillot,michaelwoerister

It is now in the queue for this repository.

bors · 2023-03-20T23:53:12Z

⌛ Testing commit c4bcac6 with merge 822c10f...

bors · 2023-03-21T02:23:19Z

☀️ Test successful - checks-actions
Approved by: cjgillot,michaelwoerister
Pushing 822c10f to master...

rust-timer · 2023-03-21T03:40:46Z

Finished benchmarking commit (822c10f): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.3%	[-0.3%, -0.3%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.3%	[-0.3%, -0.3%]	1

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	2.1%	[0.6%, 3.5%]	2
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-1.4%	[-3.5%, -0.7%]	4
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.3%	[-3.5%, 3.5%]	6

Cycles

This benchmark run did not return any relevant results for this metric.

Refactor `try_execute_query`

This merges `JobOwner::try_start` into `try_execute_query`, removing `TryGetJob` in the process. Three new functions are extracted from `try_execute_query`: `execute_job`, `cycle_error` and `wait_for_query`. This makes the control flow a bit clearer and improves performance.

Based on rust-lang#109046.

Benchmark	Before (time)	After (time)	Change (%)
🟣 clap:check	1.7134s	1.7061s	-0.43%
🟣 hyper:check	0.2519s	0.2510s	-0.35%
🟣 regex:check	0.9517s	0.9481s	-0.38%
🟣 syn:check	1.5389s	1.5338s	-0.33%
🟣 syntex_syntax:check	5.9488s	5.9258s	-0.39%
Total	10.4048s	10.3647s	-0.38%
Summary	1.0000s	0.9962s	-0.38%

r? `@cjgillot`

rustbot assigned michaelwoerister Mar 12, 2023

cjgillot reviewed Mar 12, 2023

cjgillot self-assigned this Mar 12, 2023

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 12, 2023

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 12, 2023

Zoxc mentioned this pull request Mar 13, 2023

Refactor try_execute_query #109100

Merged

cjgillot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 18, 2023

Zoxc added 2 commits March 19, 2023 17:39

Split execute_job into execute_job_incr and execute_job_non_incr

486a387

Add some assertions

c4bcac6

Zoxc force-pushed the split-execute-job branch from 5bdc711 to c4bcac6 Compare March 19, 2023 16:43

bors removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Mar 20, 2023

bors added the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Mar 20, 2023

bors added the merged-by-bors This PR was explicitly merged by bors. label Mar 21, 2023

bors merged commit 822c10f into rust-lang:master Mar 21, 2023

rustbot added this to the 1.70.0 milestone Mar 21, 2023

Zoxc deleted the split-execute-job branch March 21, 2023 02:28

matthiaskrgr mentioned this pull request Feb 13, 2024

ICE: downgrade_to_delayed_bug: cannot downgrade Warning to DelayedBug: not an error #121006

Closed
