
hygiene: Ensure uniqueness of SyntaxContextDatas #130324


Merged
merged 3 commits into rust-lang:master on Mar 26, 2025

Conversation

petrochenkov
Contributor

@petrochenkov petrochenkov commented Sep 13, 2024

SyntaxContextDatas are basically interned with SyntaxContexts working as indices, so they are supposed to be unique.
However, currently duplicate SyntaxContextDatas can be created during decoding from metadata or incremental cache.
This PR fixes that.
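
To make the invariant concrete, here is a minimal sketch of interning with index-like handles. The types `ContextData`, `ContextId` and `Interner`, and their fields, are hypothetical stand-ins rather than rustc's actual definitions; the sketch only illustrates why equal data should always resolve to the same handle.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the interned data; field names are illustrative only.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct ContextData {
    outer_expn: u32,
    parent: u32,
}

// Hypothetical handle: an index into `Interner::table`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ContextId(u32);

#[derive(Default)]
struct Interner {
    // Table indexed by `ContextId`.
    table: Vec<ContextData>,
    // Reverse map used to detect data that is already interned.
    map: HashMap<ContextData, ContextId>,
}

impl Interner {
    /// Returns the existing id for `data`, or allocates a new one.
    fn intern(&mut self, data: ContextData) -> ContextId {
        if let Some(&id) = self.map.get(&data) {
            return id;
        }
        let id = ContextId(self.table.len() as u32);
        self.table.push(data.clone());
        self.map.insert(data, id);
        id
    }

    /// Checks the invariant: no two ids hold equal data.
    fn is_unique(&self) -> bool {
        self.map.len() == self.table.len()
    }
}
```

If some code path pushed entries into `table` without consulting `map`, `is_unique` would start returning `false`. That is the kind of duplication the description attributes to decoding.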

cc #129827 (comment)

@rustbot
Collaborator

rustbot commented Sep 13, 2024

r? @TaKO8Ki

rustbot has assigned @TaKO8Ki.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Sep 13, 2024
@petrochenkov
Contributor Author

I've added many asserts; I'll change them to debug asserts if they affect performance.
@bors try @rust-timer queue
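
For context, a small sketch of the difference between the two kinds of assert. The function names and the `Vec<u32>` table are hypothetical and not taken from the PR; only `assert!` and `debug_assert!` are standard Rust macros.

```rust
/// Always checked, in every build profile.
fn push_unique_checked(table: &mut Vec<u32>, value: u32) {
    assert!(!table.contains(&value), "duplicate entry {}", value);
    table.push(value);
}

/// Checked only when debug assertions are enabled (`cfg(debug_assertions)`);
/// otherwise the condition is not evaluated at all.
fn push_unique_debug_checked(table: &mut Vec<u32>, value: u32) {
    debug_assert!(!table.contains(&value), "duplicate entry {}", value);
    table.push(value);
}
```

The second form keeps the check available in builds with debug assertions while removing its cost from optimized builds, which is the trade-off being weighed here.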


@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 13, 2024
@petrochenkov petrochenkov removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 13, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 13, 2024
hygiene: Ensure uniqueness of `SyntaxContextData`s

`SyntaxContextData`s are basically interned with `SyntaxContext`s working as keys, so they are supposed to be unique.
However, currently duplicate `SyntaxContextData`s can be created during decoding from metadata or incremental cache.
This PR fixes that.

cc rust-lang#129827 (comment)
@bors
Collaborator

bors commented Sep 13, 2024

⌛ Trying commit e577b7a with merge b517457...

@bors
Collaborator

bors commented Sep 13, 2024

☀️ Try build successful - checks-actions
Build commit: b517457 (b51745778d3c14275d7b8f9115c2aa8e3b760bfb)


@cjgillot cjgillot self-assigned this Sep 13, 2024
@rust-timer
Collaborator

Finished benchmarking commit (b517457): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.3% | [0.2%, 0.4%] | 11 |
| Regressions ❌ (secondary) | 0.3% | [0.1%, 1.1%] | 38 |
| Improvements ✅ (primary) | -0.3% | [-0.4%, -0.2%] | 10 |
| Improvements ✅ (secondary) | -0.4% | [-0.4%, -0.3%] | 3 |
| All ❌✅ (primary) | 0.0% | [-0.4%, 0.4%] | 21 |

Max RSS (memory usage)

Results (secondary 1.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 3.7% | [3.7%, 3.7%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -1.6% | [-1.6%, -1.6%] | 1 |
| All ❌✅ (primary) | - | - | 0 |

Cycles

Results (secondary -12.6%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -12.6% | [-15.5%, -2.5%] | 7 |
| All ❌✅ (primary) | - | - | 0 |

Binary size

Results (primary -0.5%, secondary -0.8%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.5% | [-2.0%, -0.0%] | 63 |
| Improvements ✅ (secondary) | -0.8% | [-2.6%, -0.0%] | 14 |
| All ❌✅ (primary) | -0.5% | [-2.0%, -0.0%] | 63 |

Bootstrap: 756.444s -> 757.208s (0.10%)
Artifact size: 341.13 MiB -> 341.19 MiB (0.02%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Sep 14, 2024
@cjgillot
Contributor

I'm not super fond of the "hopefully" rhetoric...
SyntaxContexts form a tree structure. Is there a way we could exploit it?
For example, by refactoring all this into a DFS in metadata/cache that fetches the root and then decodes the children in the proper order?

@petrochenkov
Contributor Author

petrochenkov commented Sep 14, 2024

SyntaxContexts form a tree structure

Right now they are not a tree, because opaque and opaque_and_semitransparent are caches that often refer to the context itself.
With #129827 SyntaxContextKeys will probably be a tree. They should be a tree, but I'm not sure that proc macro or built-in macro logic cannot mess something up, so this needs to be verified with a bunch of asserts too.
So the whole idea will be easier to implement after #129827.
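
A rough illustration of why the cache fields get in the way of a tree-shaped traversal. This is a simplified, hypothetical model: `Ctxt`, `CtxtData` and both helper functions are invented for the sketch, `parent` is an assumed tree edge, and only the two cache field names come from the comment above.

```rust
/// Hypothetical handle: an index into a table of contexts.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Ctxt(usize);

/// Simplified model of a context's data.
struct CtxtData {
    parent: Ctxt,
    opaque: Ctxt,
    opaque_and_semitransparent: Ctxt,
}

/// True if every non-root context has a parent with a smaller index,
/// i.e. the `parent` edges alone form a tree rooted at index 0.
fn parents_form_tree(table: &[CtxtData]) -> bool {
    table.iter().enumerate().skip(1).all(|(i, d)| d.parent.0 < i)
}

/// Returns the contexts whose cache fields point back at themselves.
fn self_referential(table: &[CtxtData]) -> Vec<Ctxt> {
    table
        .iter()
        .enumerate()
        .filter(|(i, d)| d.opaque == Ctxt(*i) || d.opaque_and_semitransparent == Ctxt(*i))
        .map(|(i, _)| Ctxt(i))
        .collect()
}
```

In this model the `parent` edges can be decoded root-first, but any entry reported by `self_referential` needs its own id to exist before its data can be written, so a pure "decode children after their dependencies" order is not enough by itself.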

@petrochenkov
Contributor Author

petrochenkov commented Sep 14, 2024

FIXME: The holes left by the decoder break the logic that assigns $crate names in fn update_dollar_crate_names. It's not too important because it's only used for pretty printing, but it is still better to fix it.

UPD: Fixed in the last commit.

@petrochenkov petrochenkov added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Sep 14, 2024
@petrochenkov
Contributor Author

and then decode the children in the proper order?

Ah, there's one more thing - not all contexts are coming from the decoder (during incremental compilation at least).

Many contexts come from the freshly redone compilation (which is typically done before incremental decoding starts) and then they need to "unify" with equivalent contexts coming from decoding - that's where the duplicates were coming from before this PR.

So even if all decoding is done in proper order, you can still decode and get a context that is equivalent to one of the freshly built ones, but you don't know it until you decode it and compare.

Maybe if #129827 eliminates recursion we'll be able to avoid reserving SyntaxContexts in advance, then there will be no holes.
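
The unification and the resulting holes can be sketched as follows. This is a simplified model under stated assumptions, not the PR's code: `DecodeTable`, `DecodedData`, and the methods `alloc_fresh`, `decode` and `holes` are all hypothetical names.

```rust
use std::collections::HashMap;

// Hypothetical, simplified context data.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct DecodedData {
    outer_expn: u32,
    parent: u32,
}

#[derive(Default)]
struct DecodeTable {
    // `None` marks a hole: an id reserved before decoding that turned out
    // to duplicate an already existing context.
    slots: Vec<Option<DecodedData>>,
    by_data: HashMap<DecodedData, usize>,
}

impl DecodeTable {
    /// Context created by the current compilation session.
    fn alloc_fresh(&mut self, data: DecodedData) -> usize {
        if let Some(&id) = self.by_data.get(&data) {
            return id;
        }
        let id = self.slots.len();
        self.slots.push(Some(data.clone()));
        self.by_data.insert(data, id);
        id
    }

    /// Context read from metadata or the incremental cache.
    fn decode(&mut self, decode_data: impl FnOnce() -> DecodedData) -> usize {
        // Assumption: the id is reserved up front because the real data may
        // refer back to the context being decoded.
        let reserved = self.slots.len();
        self.slots.push(None);
        let data = decode_data();
        if let Some(&existing) = self.by_data.get(&data) {
            // Equivalent context already exists: reuse it, leaving a hole.
            return existing;
        }
        self.slots[reserved] = Some(data.clone());
        self.by_data.insert(data, reserved);
        reserved
    }

    /// Number of reserved slots that were never filled.
    fn holes(&self) -> usize {
        self.slots.iter().filter(|s| s.is_none()).count()
    }
}
```

In this model, decoding a context equal to a freshly built one returns the fresh id and increments `holes()`. The equivalence is only discovered after the data has been decoded, matching the point made above. Dropping the up-front reservation would remove the holes, but only once the data no longer needs to refer to its own id.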

@alex-semenyuk alex-semenyuk added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Nov 1, 2024
@petrochenkov
Contributor Author

@bors try @rust-timer queue


@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 15, 2025
@bors
Collaborator

bors commented Mar 15, 2025

⌛ Trying commit b62bbea with merge e923174...

@bors
Collaborator

bors commented Mar 15, 2025

☀️ Try build successful - checks-actions
Build commit: 75fac58 (75fac58b76ca93227b93847bbc0adb431e92543f)


@rust-timer
Collaborator

Finished benchmarking commit (75fac58): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.2% | [0.2%, 0.2%] | 1 |
| Regressions ❌ (secondary) | 0.2% | [0.1%, 0.3%] | 14 |
| Improvements ✅ (primary) | -0.3% | [-0.6%, -0.1%] | 38 |
| Improvements ✅ (secondary) | -0.5% | [-0.5%, -0.5%] | 3 |
| All ❌✅ (primary) | -0.3% | [-0.6%, 0.2%] | 39 |

Max RSS (memory usage)

Results (primary -1.6%, secondary -0.2%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 2.4% | [2.2%, 2.7%] | 2 |
| Improvements ✅ (primary) | -1.6% | [-2.1%, -1.1%] | 3 |
| Improvements ✅ (secondary) | -5.4% | [-5.4%, -5.4%] | 1 |
| All ❌✅ (primary) | -1.6% | [-2.1%, -1.1%] | 3 |

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

Results (primary -0.5%, secondary -0.8%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.5% | [-2.0%, -0.0%] | 68 |
| Improvements ✅ (secondary) | -0.8% | [-2.6%, -0.0%] | 14 |
| All ❌✅ (primary) | -0.5% | [-2.0%, -0.0%] | 68 |

Bootstrap: 773.052s -> 772.368s (-0.09%)
Artifact size: 365.08 MiB -> 365.01 MiB (-0.02%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Mar 15, 2025
@petrochenkov
Contributor Author

The perf is better with debug asserts, and it's going to be even better with #129827.
@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 15, 2025
@petrochenkov
Contributor Author

SyntaxContexts form a tree structure

Right now they are not a tree, because opaque and opaque_and_semitransparent are caches that often refer to the context itself. With #129827 SyntaxContextKeys will probably be a tree. They should be a tree, but I'm not sure that proc macro or built-in macro logic cannot mess something up, so this needs to be verified with a bunch of asserts too. So the whole idea will be easier to implement after #129827.

I'll try to get rid of the holes in syntactic contexts after #129827 lands (I'll update it myself if @bvanjoi is unavailable).

@bvanjoi
Contributor

bvanjoi commented Mar 16, 2025

I'll try to get rid of the holes in syntactic contexts after #129827 lands

Does this mean that #129827 should land before this PR? If so, I'll update it and resolve the git conflicts.

@petrochenkov
Contributor Author

@bvanjoi
No, I mean we merge this one first, then #129827, and after that syntax context decoding won't have cycles.

@petrochenkov
Contributor Author

r? compiler

@rustbot rustbot assigned oli-obk and unassigned cjgillot Mar 24, 2025
@oli-obk
Contributor

oli-obk commented Mar 26, 2025

@bors r+

@bors
Collaborator

bors commented Mar 26, 2025

📌 Commit 07328d5 has been approved by oli-obk

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 26, 2025
@bors
Collaborator

bors commented Mar 26, 2025

⌛ Testing commit 07328d5 with merge 19cab6b...

@bors
Collaborator

bors commented Mar 26, 2025

☀️ Test successful - checks-actions
Approved by: oli-obk
Pushing 19cab6b to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Mar 26, 2025
@bors bors merged commit 19cab6b into rust-lang:master Mar 26, 2025
7 checks passed
@rustbot rustbot added this to the 1.87.0 milestone Mar 26, 2025

What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing f1bc669 (parent) -> 19cab6b (this PR)

Test differences

Show 8 test diffs

Additionally, 8 doctest diffs were found. These are ignored, as they are noisy.

Job group index

@rust-timer
Collaborator

Finished benchmarking commit (19cab6b): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Our benchmarks found a performance regression caused by this PR.
This might be an actual regression, but it can also be just noise.

Next Steps:

  • If the regression was expected or you think it can be justified,
    please write a comment with sufficient written justification, and add
    @rustbot label: +perf-regression-triaged to it, to mark the regression as triaged.
  • If you think that you know of a way to resolve the regression, try to create
    a new PR with a fix for the regression.
  • If you do not understand the regression or you think that it is just noise,
    you can ask the @rust-lang/wg-compiler-performance working group for help (members of this group
    were already notified of this PR).

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 0.2% | [0.1%, 0.2%] | 2 |
| Improvements ✅ (primary) | -0.4% | [-0.7%, -0.2%] | 30 |
| Improvements ✅ (secondary) | -0.5% | [-0.5%, -0.5%] | 3 |
| All ❌✅ (primary) | -0.4% | [-0.7%, -0.2%] | 30 |

Max RSS (memory usage)

Results (primary -1.4%, secondary 0.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.0% | [1.0%, 1.0%] | 1 |
| Regressions ❌ (secondary) | 2.0% | [2.0%, 2.0%] | 1 |
| Improvements ✅ (primary) | -2.2% | [-4.5%, -0.7%] | 3 |
| Improvements ✅ (secondary) | -1.8% | [-1.8%, -1.8%] | 1 |
| All ❌✅ (primary) | -1.4% | [-4.5%, 1.0%] | 4 |

Cycles

Results (secondary -0.3%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 2.5% | [2.5%, 2.5%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -3.0% | [-3.0%, -3.0%] | 1 |
| All ❌✅ (primary) | - | - | 0 |

Binary size

Results (primary -0.5%, secondary -0.8%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | -0.5% | [-2.0%, -0.0%] | 68 |
| Improvements ✅ (secondary) | -0.8% | [-2.6%, -0.0%] | 14 |
| All ❌✅ (primary) | -0.5% | [-2.0%, -0.0%] | 68 |

Bootstrap: 777.678s -> 777.548s (-0.02%)
Artifact size: 365.82 MiB -> 365.81 MiB (-0.00%)

@panstromek
Contributor

Perf triage:

Improvements outweigh regressions. The tt-muncher regression seems spurious (the benchmark returned to its previous state in a following rollup), and deep-vector looks like noise.

@rustbot label: +perf-regression-triaged

@rustbot rustbot added the perf-regression-triaged The performance regression has been triaged. label Mar 31, 2025
Labels
merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

10 participants