Experiment with using HashMap::with_capacity throughout the compiler #137005


Open
jyn514 opened this issue Feb 14, 2025 · 8 comments
Assignees
Labels
C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such E-medium Call for participation: Medium difficulty. Experience needed to fix: Intermediate. E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. I-compiletime Issue: Problems and improvements with respect to compile times. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@jyn514
Member

jyn514 commented Feb 14, 2025

We have found several times that rustc's performance is basically a giant benchmark of our hashmaps. @orlp notes:

Resizing the hash table during building basically means all the work you did up until that point was to calculate a single boolean value: your capacity wasn't enough, and nothing else of value was done.

If we can avoid resizing our hashmaps, that will avoid quite a lot of needless hashing. Given that so much of our benchmark time is spent hashing, avoiding it could yield a significant overall performance improvement. @orlp wrote this helpful benchmark giving an idea of the order of magnitude improvement we're talking about:

// Note: `FixedState` and `CardinalitySketch` are not in std; in this benchmark
// they come from external crates (`FixedState` is a seedable BuildHasher, e.g.
// from foldhash; `CardinalitySketch` is a HyperLogLog-style estimator).
use std::collections::HashSet;
use std::hash::BuildHasher;

fn main() {
    let start = std::time::Instant::now();
    for _ in 0..100 {
        let mut hs = HashSet::new();
        for x in 0..1_000_000 {
            hs.insert(x);
        }
        std::hint::black_box(hs);
    }
    println!("no capacity: {:?}", start.elapsed());

    let start = std::time::Instant::now();
    for _ in 0..100 {
        let mut hs = HashSet::with_capacity(1_000_000);
        for x in 0..1_000_000 {
            hs.insert(x);
        }
        std::hint::black_box(hs);
    }
    println!("perfect capacity: {:?}", start.elapsed());

    let start = std::time::Instant::now();
    for _ in 0..100 {
        let mut hs = HashSet::with_capacity(1_000_000);
        for x in 0..hs.capacity() + 1 {
            hs.insert(x);
        }
        std::hint::black_box(hs);
    }
    println!("worst-case underestimation: {:?}", start.elapsed());

    let start = std::time::Instant::now();
    for _ in 0..100 {
        let hasher = FixedState::with_seed(42);
        let mut sketch = CardinalitySketch::new();
        for x in 0..1_000_000 {
            sketch.insert(hasher.hash_one(x));
        }

        let mut hs = HashSet::with_capacity(sketch.estimate() * 5 / 4);
        for x in 0..1_000_000 {
            hs.insert(x);
        }
        std::hint::black_box(hs);
    }
    println!("cardinality estimate: {:?}", start.elapsed());
}
no capacity: 1.582721584s
perfect capacity: 337.567208ms
worst-case underestimation: 2.038280209s
cardinality estimate: 412.6665ms

There are two main parts to this issue.

  1. Determine whether making changes here will be an improvement (I feel fairly confident about this, but it's nice to have data). This could be done by either:
    • Looking at a performance sample to see which functions are hot, and seeing how much time resizing functions are taking up (I think the relevant function is hashbrown::raw::RawTableInner::resize_inner). This will also let us know which maps are most in need of pre-allocating (which will be helpful in part 2).
    • Using a very liberal estimate of how much capacity we need (say, 10x the number of items in the crate for each hashmap) and doing a perf run on that to see if we get any improvements.
  2. Determining an upper bound for each hashmap so we don't allocate more memory than we need. I expect this to be the hard part.

(and then, of course, actually switching to using with_capacity)
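The final switch is mechanical once a bound is known. A minimal sketch of what step 2's output looks like in code (the function and the bound are hypothetical, not actual compiler code):

```rust
use std::collections::HashMap;

// Hypothetical example: if the number of entries can be bounded up front
// (here `expected_items`), pre-sizing avoids every intermediate resize.
fn build_map(expected_items: usize) -> HashMap<u32, u32> {
    let mut map = HashMap::with_capacity(expected_items);
    for i in 0..expected_items as u32 {
        map.insert(i, i * 2);
    }
    // `with_capacity(n)` guarantees no reallocation for at least `n` inserts.
    debug_assert!(map.capacity() >= expected_items);
    map
}
```

The hard part, as noted above, is finding an `expected_items` that is tight enough to save memory but never a large overestimate.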

@rustbot label T-compiler I-slow E-medium E-mentor

@rustbot rustbot added needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Feb 14, 2025
@workingjubilee workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Feb 14, 2025
@rustbot rustbot added E-medium Call for participation: Medium difficulty. Experience needed to fix: Intermediate. E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. labels Feb 14, 2025
@steffahn
Member

steffahn commented Feb 14, 2025

Do note that parts of this test case are flawed. 0..hs.capacity() + 1 is way larger than 1_000_000. (It’s 1_835_009.)


Also, I had played around with these test cases – and variations of them – for a few hours when they were first shared, and noticed that some performance could already be gained by skipping a few resize steps. By doing a capacity == len check before an insert and reserving 2 * capacity + 1 extra space, we'd do x4 resizes instead of x2. In a similar way, one can do x8 as well.
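The trick above can be sketched as a small helper (the function name is mine, not from the experiments):

```rust
use std::collections::HashSet;
use std::hash::Hash;

// Check `len == capacity` before inserting and reserve `2 * capacity + 1`
// extra slots, so the table roughly quadruples instead of doubling on growth.
fn insert_x4<T: Hash + Eq>(set: &mut HashSet<T>, value: T) -> bool {
    if set.len() == set.capacity() {
        // From `len == capacity == c`, reserving `2c + 1` additional slots
        // asks for `3c + 1` total capacity, which the table rounds up to ~4c.
        set.reserve(2 * set.capacity() + 1);
    }
    set.insert(value)
}
```

This trades memory headroom for fewer rehash passes; the x8 variant just reserves a larger multiple.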

I also tested some code where hash values were cached, but that ended up being less clearly beneficial, even with the expensive default hasher, at least in this test case. It may very well be that it's more effective for cases where the maps contain more complex types¹ than simple word- or pointer-sized ones, where hashing could involve touching more data and/or following some indirection.

Footnotes

  1. Here, complex types behind a simple interned pointer don't count if they only hash the pointer, but e.g. the maps that do the interning do need to calculate real hashes that are (at least slightly) more nontrivial.

@saethlin saethlin added I-compiletime Issue: Problems and improvements with respect to compile times. and removed I-slow Issue: Problems and improvements with respect to performance of generated code. needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Feb 14, 2025
@ARandomDev99
Contributor

@rustbot claim

I'm interested in working on this. I've made some contributions to rust-lang/rust-clippy before. Is that experience sufficient to work with a mentor or should I be looking for even easier issues?

bors added a commit to rust-lang-ci/rust that referenced this issue Feb 15, 2025
Reserve `HashSet` capacity before inserting cfgs/check-cfgs

This PR tries to reserve capacity before inserting cfgs/check-cfgs into their hashset.

Related to rust-lang#137005; mostly an experiment, but even if perf is neutral it's still good practice.
@Kobzol
Contributor

Kobzol commented Feb 15, 2025

I think that analysing the hashmap usage should be doable. You can build the compiler with debuginfo (rust.debuginfo-level = "line-tables-only") and then profile the compiler e.g. using perf or Cachegrind (https://github.com/rust-lang/rustc-perf/tree/master/collector#profiling) to find the occurrences of the resize_inner function.

@jyn514
Member Author

jyn514 commented Feb 15, 2025

You could also get a pre-made Cachegrind file for any given perf run; see rust-lang/rustc-dev-guide#1692. But ideally you'd create it locally; that lets you verify sooner that your changes have an impact.

Note that we build with PGO and LTO in CI, so your numbers will not match perf.rlo exactly. But as long as you see a relative improvement, that's still encouraging.

@the8472
Member

the8472 commented Feb 15, 2025

Back in #90743 I did try initializing the types and predicates interning tables with a higher initial capacity and also more aggressive resizing; neither approach looked good on perf.

@nnethercote
Contributor

throughout the compiler
...
If we can avoid resizing our hashmaps

I think this needs some care. I suspect there will be an extremely skewed distribution, e.g. 99.9% of the potential benefit would come from 0.1% of the hashmaps.

This issue has already inspired #137069, which was about pre-sizing a hashmap with a few dozen entries in it. That is not a good use of anyone's time.

So for anyone planning to look into this, please do some measurements first to determine which hashmaps are reliably the biggest, and don't waste time on anything smaller. I suspect hashmaps with hundreds of thousands or millions of entries would be good candidates.

bors added a commit to rust-lang-ci/rust that referenced this issue Feb 21, 2025
[perf experiment] Changed interners to start preallocated with an increased capacity

Inspired by rust-lang#137005.

*Not meant to be merged in its current form*

Added a `with_capacity` function to `InternedSet`. Changed the `CtxtInterners` to start with `InternedSets` preallocated with a capacity.

This *does* increase memory usage very slightly (by 1 MB at the start), although that increase quickly disappears for larger crates (since they require such capacity anyway).

A local perf run indicates this improves compile times for small crates (like `ripgrep`), without a negative effect on larger ones:
![image](https://github.com/user-attachments/assets/4a7f3317-7e61-4b28-a651-cc79ee990689)

The current default capacities are chosen somewhat arbitrarily, and are relatively low.

Depending on what kind of memory usage is acceptable, it may be beneficial to increase the capacity for some interners.

From a second local perf run (with the capacity of `_type` increased to `131072`), it looks like increasing the size of the preallocated type interner has the biggest impact:
![image](https://github.com/user-attachments/assets/08ac324a-b03c-4fe9-b779-4dd35e7970d9)

What would be the maximum acceptable memory usage increase? I think most people would not mind sacrificing 1-2 MB for an improvement in compile speed, but I am curious what the general opinion is here.
bors added a commit to rust-lang-ci/rust that referenced this issue Feb 21, 2025
[perf experiment] Changed interners to start preallocated with an increased capacity
bors added a commit to rust-lang-ci/rust that referenced this issue Feb 22, 2025
[perf experiment] Changed interners to start preallocated with an increased capacity
bors added a commit to rust-lang-ci/rust that referenced this issue Feb 26, 2025
Change interners to start preallocated with an increased capacity

Inspired by rust-lang#137005.

Added a `with_capacity` function to `InternedSet`. Changed the `CtxtInterners` to start with `InternedSets` preallocated with a capacity.

This *does* increase memory usage very slightly (by ~1 MB at the start), although that increase quickly disappears for larger crates (since they require such capacity anyway).

A local perf run indicates this improves compile times for small crates (like `ripgrep`), without a negative effect on larger ones.
github-actions bot pushed a commit to rust-lang/rustc-dev-guide that referenced this issue Feb 27, 2025
Change interners to start preallocated with an increased capacity
@FractalFir
Contributor

FractalFir commented Feb 27, 2025

I have been looking into where most HashMap resizes occur (when building an optimized build of cargo), and I have a list of the most common sources (made using `counts`).

10139825 counts (weighted integral, erased)
(  1)  2073458 (20.4%, 20.4%): Resize Location { file: "compiler/rustc_query_impl/src/plumbing.rs", line: 390, col: 38 }, len:NNN
(  2)  1995425 (19.7%, 40.1%): Resize Location { file: "compiler/rustc_expand/src/placeholders.rs", line: 202, col: 33 }, len:NNN
(  3)  1431002 (14.1%, 54.2%): Resize Location { file: "/home/michal/rust/compiler/rustc_query_system/src/query/caches.rs", line: 70, col: 14 }, len:NNN
(  4)   596775 ( 5.9%, 60.1%): Resize Location { file: "compiler/rustc_ast_lowering/src/lib.rs", line: 645, col: 48 }, len:NNN
(  5)   458766 ( 4.5%, 64.7%): Resize Location { file: "compiler/rustc_resolve/src/lib.rs", line: 2064, col: 54 }, len:NNN
(  6)   370440 ( 3.7%, 68.3%): Resize Location { file: "compiler/rustc_hir_typeck/src/fn_ctxt/_impl.rs", line: 150, col: 37 }, len:NNN
(  7)   370440 ( 3.7%, 72.0%): Resize Location { file: "compiler/rustc_hir_typeck/src/writeback.rs", line: 143, col: 46 }, len:NNN
(  8)   340493 ( 3.4%, 75.3%): Resize Location { file: "/home/michal/rust/compiler/rustc_query_system/src/cache.rs", line: 40, col: 35 }, len:NNN
(  9)   279125 ( 2.8%, 78.1%): Resize Location { file: "compiler/rustc_infer/src/infer/relate/generalize.rs", line: 563, col: 20 }, len:NNN
( 10)   266116 ( 2.6%, 80.7%): Resize Location { file: "/home/michal/rust/compiler/rustc_middle/src/ty/impls_ty.rs", line: 38, col: 32 }, len:NNN
( 11)   192650 ( 1.9%, 82.6%): Resize Location { file: "compiler/rustc_middle/src/ty/impls_ty.rs", line: 38, col: 32 }, len:NNN
( 12)   157776 ( 1.6%, 84.1%): Resize Location { file: "/home/michal/rust/compiler/rustc_data_structures/src/sso/set.rs", line: 134, col: 18 }, len:NNN
( 13)   114700 ( 1.1%, 85.3%): Resize Location { file: "compiler/rustc_middle/src/mir/interpret/mod.rs", line: 533, col: 62 }, len:NNN
( 14)   104549 ( 1.0%, 86.3%): Resize Location { file: "compiler/rustc_codegen_llvm/src/mono_item.rs", line: 86, col: 37 }, len:NNN
( 15)   104425 ( 1.0%, 87.3%): Resize Location { file: "compiler/rustc_codegen_llvm/src/type_of.rs", line: 213, col: 44 }, len:NNN
( 16)    86941 ( 0.9%, 88.2%): Resize Location { file: "/home/michal/rust/compiler/rustc_middle/src/ty/codec.rs", line: 107, col: 24 }, len:NNN
( 17)    62710 ( 0.6%, 88.8%): Resize Location { file: "compiler/rustc_hir_typeck/src/writeback.rs", line: 629, col: 49 }, len:NNN
( 18)    60568 ( 0.6%, 89.4%): Resize Location { file: "compiler/rustc_hir_typeck/src/fn_ctxt/_impl.rs", line: 199, col: 62 }, len:NNN
( 19)    57355 ( 0.6%, 90.0%): Resize Location { file: "compiler/rustc_monomorphize/src/collector.rs", line: 279, col: 31 }, len:NNN
( 20)    57305 ( 0.6%, 90.5%): Resize Location { file: "compiler/rustc_hir_typeck/src/writeback.rs", line: 644, col: 55 }, len:NNN
( 21)    46268 ( 0.5%, 91.0%): Resize Location { file: "compiler/rustc_hir_typeck/src/writeback.rs", line: 612, col: 59 }, len:NNN
( 22)    45277 ( 0.4%, 91.4%): Resize Location { file: "compiler/rustc_hir_typeck/src/fn_ctxt/_impl.rs", line: 180, col: 68 }, len:NNN
( 23)    38488 ( 0.4%, 91.8%): Resize Location { file: "compiler/rustc_infer/src/traits/project.rs", line: 163, col: 13 }, len:NNN
( 24)    36243 ( 0.4%, 92.2%): Resize Location { file: "compiler/rustc_codegen_llvm/src/callee.rs", line: 164, col: 31 }, len:NNN
( 25)    32565 ( 0.3%, 92.5%): Resize Location { file: "compiler/rustc_hir_typeck/src/place_op.rs", line: 327, col: 68 }, len:NNN
( 26)    31397 ( 0.3%, 92.8%): Resize Location { file: "compiler/rustc_codegen_llvm/src/consts.rs", line: 268, col: 41 }, len:NNN
( 27)    31010 ( 0.3%, 93.1%): Resize Location { file: "compiler/rustc_hir_typeck/src/pat.rs", line: 896, col: 66 }, len:NNN
( 28)    31010 ( 0.3%, 93.4%): Resize Location { file: "compiler/rustc_hir_typeck/src/writeback.rs", line: 322, col: 65 }, len:NNN
( 29)    29683 ( 0.3%, 93.7%): Resize Location { file: "compiler/rustc_ast_lowering/src/lib.rs", line: 639, col: 28 }, len:NNN
( 30)    28682 ( 0.3%, 94.0%): Resize Location { file: "compiler/rustc_metadata/src/rmeta/decoder.rs", line: 413, col: 36 }, len:NNN
( 31)    28682 ( 0.3%, 94.3%): Resize Location { file: "compiler/rustc_middle/src/query/on_disk_cache.rs", line: 316, col: 37 }, len:NNN
( 32)    28682 ( 0.3%, 94.6%): Resize Location { file: "compiler/rustc_resolve/src/lib.rs", line: 2071, col: 27 }, len:NNN
( 33)    28682 ( 0.3%, 94.9%): Resize Location { file: "compiler/rustc_resolve/src/lib.rs", line: 2083, col: 63 }, len:NNN
( 34)    28681 ( 0.3%, 95.1%): Resize Location { file: "compiler/rustc_resolve/src/lib.rs", line: 1377, col: 36 }, len:NNN
( 35)    28605 ( 0.3%, 95.4%): Resize Location { file: "compiler/rustc_infer/src/traits/project.rs", line: 179, col: 17 }, len:NNN
( 36)    25449 ( 0.3%, 95.7%): Resize Location { file: "compiler/rustc_trait_selection/src/traits/specialize/specialization_graph.rs", line: 351, col: 24 }, len:NNN
( 37)    24401 ( 0.2%, 95.9%): Resize Location { file: "compiler/rustc_mir_build/src/builder/matches/mod.rs", line: 2824, col: 26 }, len:NNN
( 38)    24330 ( 0.2%, 96.1%): Resize Location { file: "compiler/rustc_ast_lowering/src/pat.rs", line: 294, col: 62 }, len:NNN
( 39)    21638 ( 0.2%, 96.4%): Resize Location { file: "compiler/rustc_resolve/src/late.rs", line: 2198, col: 58 }, len:NNN
( 40)    17757 ( 0.2%, 96.5%): Resize Location { file: "compiler/rustc_hir_typeck/src/fn_ctxt/_impl.rs", line: 171, col: 62 }, len:NNN
( 41)    17757 ( 0.2%, 96.7%): Resize Location { file: "compiler/rustc_hir_typeck/src/writeback.rs", line: 602, col: 53 }, len:NNN
( 42)    16891 ( 0.2%, 96.9%): Resize Location { file: "compiler/rustc_resolve/src/late.rs", line: 3929, col: 54 }, len:NNN
( 43)    14885 ( 0.1%, 97.0%): Resize Location { file: "compiler/rustc_resolve/src/late.rs", line: 4850, col: 34 }, len:NNN
( 44)    14345 ( 0.1%, 97.2%): Resize Location { file: "compiler/rustc_middle/src/query/on_disk_cache.rs", line: 322, col: 35 }, len:NNN
( 45)    14345 ( 0.1%, 97.3%): Resize Location { file: "compiler/rustc_resolve/src/macros.rs", line: 187, col: 40 }, len:NNN
( 46)    14344 ( 0.1%, 97.4%): Resize Location { file: "compiler/rustc_resolve/src/build_reduced_graph.rs", line: 1167, col: 64 }, len:NNN
( 47)    14344 ( 0.1%, 97.6%): Resize Location { file: "compiler/rustc_resolve/src/def_collector.rs", line: 98, col: 59 }, len:NNN
( 48)    14337 ( 0.1%, 97.7%): Resize Location { file: "/home/michal/rust/compiler/rustc_span/src/hygiene.rs", line: 208, col: 53 }, len:NNN
( 49)    13444 ( 0.1%, 97.9%): Resize Location { file: "/home/michal/rust/compiler/rustc_span/src/hygiene.rs", line: 234, col: 53 }, len:NNN
( 50)    12629 ( 0.1%, 98.0%): Resize Location { file: "compiler/rustc_infer/src/traits/project.rs", line: 221, col: 32 }, len:NNN
( 51)    12336 ( 0.1%, 98.1%): Resize Location { file: "compiler/rustc_mir_transform/src/validate.rs", line: 187, col: 36 }, len:NNN
( 52)    11253 ( 0.1%, 98.2%): Resize Location { file: "compiler/rustc_monomorphize/src/collector.rs", line: 569, col: 26 }, len:NNN
( 53)    10735 ( 0.1%, 98.3%): Resize Location { file: "compiler/rustc_monomorphize/src/collector.rs", line: 616, col: 22 }, len:NNN
( 54)    10358 ( 0.1%, 98.4%): Resize Location { file: "/home/michal/rust/compiler/rustc_mir_dataflow/src/value_analysis.rs", line: 54, col: 22 }, len:NNN
( 55)     9522 ( 0.1%, 98.5%): Resize Location { file: "compiler/rustc_codegen_llvm/src/type_of.rs", line: 246, col: 39 }, len:NNN
( 56)     9320 ( 0.1%, 98.6%): Resize Location { file: "compiler/rustc_ast_lowering/src/lib.rs", line: 563, col: 49 }, len:NNN
( 57)     7861 ( 0.1%, 98.7%): Resize Location { file: "/home/michal/rust/compiler/rustc_type_ir/src/data_structures/delayed_map.rs", line: 38, col: 20 }, len:NNN
( 58)     7666 ( 0.1%, 98.8%): Resize Location { file: "compiler/rustc_hir_typeck/src/fn_ctxt/_impl.rs", line: 242, col: 18 }, len:NNN
( 59)     7176 ( 0.1%, 98.8%): Resize Location { file: "compiler/rustc_resolve/src/lib.rs", line: 1364, col: 36 }, len:NNN
( 60)     7176 ( 0.1%, 98.9%): Resize Location { file: "compiler/rustc_span/src/hygiene.rs", line: 1308, col: 56 }, len:NNN
( 61)     7176 ( 0.1%, 99.0%): Resize Location { file: "compiler/rustc_span/src/hygiene.rs", line: 1309, col: 58 }, len:NNN
( 62)     7169 ( 0.1%, 99.1%): Resize Location { file: "compiler/rustc_resolve/src/late.rs", line: 4268, col: 38 }, len:NNN
( 63)     7044 ( 0.1%, 99.1%): Resize Location { file: "compiler/rustc_resolve/src/late.rs", line: 2141, col: 58 }, len:NNN
( 64)     6628 ( 0.1%, 99.2%): Resize Location { file: "compiler/rustc_resolve/src/late.rs", line: 4855, col: 34 }, len:NNN
( 65)     5335 ( 0.1%, 99.2%): Resize Location { file: "compiler/rustc_mir_build/src/builder/mod.rs", line: 982, col: 38 }, len:NNN
( 66)     4938 ( 0.0%, 99.3%): Resize Location { file: "compiler/rustc_resolve/src/late.rs", line: 3145, col: 32 }, len:NNN
( 67)     4336 ( 0.0%, 99.3%): Resize Location { file: "compiler/rustc_hir_typeck/src/check.rs", line: 114, col: 61 }, len:NNN
( 68)     4336 ( 0.0%, 99.4%): Resize Location { file: "compiler/rustc_hir_typeck/src/writeback.rs", line: 701, col: 57 }, len:NNN
( 69)     3591 ( 0.0%, 99.4%): Resize Location { file: "compiler/rustc_middle/src/mir/interpret/mod.rs", line: 463, col: 15 }, len:NNN
( 70)     3591 ( 0.0%, 99.4%): Resize Location { file: "compiler/rustc_resolve/src/late.rs", line: 5040, col: 35 }, len:NNN
( 71)     3591 ( 0.0%, 99.5%): Resize Location { file: "compiler/rustc_resolve/src/late.rs", line: 5066, col: 52 }, len:NNN
( 72)     3590 ( 0.0%, 99.5%): Resize Location { file: "/home/michal/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/fluent-bundle-0.15.3/src/bundle.rs", line: 315, col: 26 }, len:NNN
( 73)     3116 ( 0.0%, 99.5%): Resize Location { file: "compiler/rustc_borrowck/src/type_check/relate_tys.rs", line: 221, col: 29 }, len:NNN
( 74)     3002 ( 0.0%, 99.6%): Resize Location { file: "compiler/rustc_hir_typeck/src/expr.rs", line: 2046, col: 29 }, len:NNN
( 75)     2837 ( 0.0%, 99.6%): Resize Location { file: "/home/michal/rust/compiler/rustc_mir_dataflow/src/un_derefer.rs", line: 15, col: 27 }, len:NNN
( 76)     2335 ( 0.0%, 99.6%): Resize Location { file: "compiler/rustc_expand/src/mbe/macro_check.rs", line: 300, col: 25 }, len:NNN
( 77)     2140 ( 0.0%, 99.6%): Resize Location { file: "compiler/rustc_codegen_llvm/src/context.rs", line: 815, col: 38 }, len:NNN
( 78)     1950 ( 0.0%, 99.7%): Resize Location { file: "/home/michal/rust/compiler/rustc_codegen_ssa/src/meth.rs", line: 123, col: 31 }, len:NNN
( 79)     1800 ( 0.0%, 99.7%): Resize Location { file: "compiler/rustc_span/src/source_map.rs", line: 296, col: 40 }, len:NNN

EDIT: changed the results to be weighted based on reallocation size (which corresponds to the cost of a resize)
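A hypothetical reconstruction of the instrumentation behind the numbers above (the actual patch may differ): a thin wrapper that records where the map was created via `#[track_caller]` and logs a line whenever an insert grows the table, so the output can be aggregated by a tool like `counts`.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::panic::Location;

// Wrapper that remembers its creation site and reports capacity growth.
struct TracedMap<K, V> {
    inner: HashMap<K, V>,
    created_at: &'static Location<'static>,
}

impl<K: Hash + Eq, V> TracedMap<K, V> {
    #[track_caller]
    fn new() -> Self {
        TracedMap { inner: HashMap::new(), created_at: Location::caller() }
    }

    fn insert(&mut self, k: K, v: V) -> Option<V> {
        let before = self.inner.capacity();
        let old = self.inner.insert(k, v);
        if self.inner.capacity() != before {
            // One line per resize; aggregating these yields a table like the one above.
            eprintln!("Resize {:?}, len:{}", self.created_at, self.inner.len());
        }
        old
    }
}
```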

@nnethercote
Contributor

Yep, this is exactly the "99.9% of the potential benefit would come from 0.1% of the hashmaps" I mentioned earlier :)

github-actions bot pushed a commit to rust-lang/miri that referenced this issue Mar 2, 2025
Change interners to start preallocated with an increased capacity
lnicola pushed a commit to lnicola/rust-analyzer that referenced this issue Mar 3, 2025
Change interners to start preallocated with an increased capacity
10 participants