Simplify implementation of Rust intrinsics by using type parameters in the cache #142259

sayantn · 2025-06-09T18:54:32Z

The current implementation of intrinsics has a lot of duplication to handle the different overloads of overloaded LLVM intrinsics. This PR keys the cache on the base name and the type parameters instead of the full, overloaded name. The benefit is that call_intrinsic no longer needs to provide the full name, only the type parameters (which are usually more readily available). It uses LLVMIntrinsicCopyOverloadedName2 to derive the overloaded name from the base name and the type parameters, and uses that name only to declare the function.
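To make the keying scheme concrete, here is a minimal sketch, not the code from this PR. It assumes a toy `Ty` enum in place of `&'ll Type`, and a hypothetical `overloaded_name` helper standing in for `LLVMIntrinsicCopyOverloadedName2`; it only handles scalar integer and float suffixes, whereas LLVM's real mangling also covers vectors, pointers, and more.

```rust
use std::collections::HashMap;

/// Toy stand-in for an LLVM type (assumption; the compiler uses `&'ll Type`).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Ty {
    Int(u32),
    F32,
    F64,
}

impl Ty {
    /// Suffix used for this type in an overloaded intrinsic name.
    fn suffix(self) -> String {
        match self {
            Ty::Int(bits) => format!("i{bits}"),
            Ty::F32 => "f32".to_string(),
            Ty::F64 => "f64".to_string(),
        }
    }
}

/// Hypothetical stand-in for `LLVMIntrinsicCopyOverloadedName2`:
/// appends one suffix per type parameter to the base name.
fn overloaded_name(base: &str, type_params: &[Ty]) -> String {
    let mut name = base.to_string();
    for ty in type_params {
        name.push('.');
        name.push_str(&ty.suffix());
    }
    name
}

/// Cache keyed by (base name, type parameters) rather than by the full name.
#[derive(Default)]
struct IntrinsicCache {
    declared: HashMap<(String, Vec<Ty>), String>,
}

impl IntrinsicCache {
    /// Returns the overloaded name, deriving it only on a cache miss.
    fn get_or_declare(&mut self, base: &str, type_params: &[Ty]) -> &str {
        self.declared
            .entry((base.to_string(), type_params.to_vec()))
            .or_insert_with(|| overloaded_name(base, type_params))
    }
}

fn main() {
    let mut cache = IntrinsicCache::default();
    println!("{}", cache.get_or_declare("llvm.ctpop", &[Ty::Int(32)]));
    println!("{}", cache.get_or_declare("llvm.sqrt", &[Ty::F32]));
    println!("{}", cache.get_or_declare("llvm.sqrt", &[Ty::F64]));
}
```

The caller only ever supplies the base name and the type parameters; the full name exists solely as the value stored at declaration time.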

(originally part of #140763, split off later)

@rustbot label A-codegen A-LLVM
r? codegen

rustbot · 2025-06-09T18:54:39Z

Some changes occurred in compiler/rustc_codegen_ssa

cc @WaffleLapkin

Some changes occurred in compiler/rustc_codegen_gcc

cc @antoyo, @GuillaumeGomez

workingjubilee · 2025-06-09T22:42:07Z

This seems like it might be perf-sensitive.

@bors2 try @rust-timer queue

rust-bors · 2025-06-09T22:42:12Z

⌛ Trying commit ea453f7 with merge 57fad72…

To cancel the try build, run the command @bors2 try cancel.

rust-bors · 2025-06-10T01:12:35Z

☀️ Try build successful (CI)
Build commit: 57fad72 (57fad72009e5865a872ba3d86eccf0f3a1917f99)

rust-timer · 2025-06-10T13:39:27Z

Finished benchmarking commit (57fad72): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results (primary 1.7%, secondary 2.9%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	1.7%	[1.7%, 1.7%]	1
Regressions ❌ (secondary)	2.9%	[2.3%, 3.5%]	2
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	1.7%	[1.7%, 1.7%]	1

Cycles

Results (secondary -2.3%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-2.3%	[-2.3%, -2.3%]	1
All ❌✅ (primary)	-	-	0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 754.509s -> 753.616s (-0.12%)
Artifact size: 372.30 MiB -> 372.29 MiB (-0.00%)

sayantn · 2025-06-10T14:17:15Z

@workingjubilee seems like there is no perf impact! (I am surprised actually)

workingjubilee · 2025-06-10T16:23:05Z

neat, I will try to take a closer look later but this is a very good cleanup so I expect to be approving it later today.

nikic · 2025-06-10T20:03:26Z

compiler/rustc_codegen_llvm/src/context.rs

@@ -861,372 +869,156 @@ impl<'ll> CodegenCx<'ll, '_> {
        } else {
            self.type_variadic_func(&[], ret)
        };
-        let f = self.declare_cfn(name, llvm::UnnamedAddr::No, fn_ty);
-        self.intrinsics.borrow_mut().insert(name, (fn_ty, f));
+


So here we are still creating the intrinsic function type based on the argument and return value types -- however, the signature is already uniquely determined by the base_name and type_params.

I think it would be a lot better to use LLVMGetIntrinsicDeclaration, which accepts the ID and type parameters. This means that we no longer need args and ret -- and crucially, this means we don't need to maintain the list of intrinsics inside declare_intrinsic anymore!
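As an illustration of the point that the signature is already determined by the base name and the type parameters, here is a small sketch. The `Ty`, `FnSig`, and `signature` names are hypothetical, and the three-entry table is only a toy: in LLVM this knowledge lives inside the library itself, which is why a caller would not need to pass args and ret.

```rust
/// Toy stand-in for an LLVM type (assumption, not the compiler's type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Ty {
    I1,
    I32,
    F32,
}

/// Argument and return types of an intrinsic declaration.
#[derive(Debug, PartialEq, Eq)]
struct FnSig {
    args: Vec<Ty>,
    ret: Ty,
}

/// Hypothetical lookup: derives the full signature from the base name
/// and the type parameters alone.
fn signature(base: &str, type_params: &[Ty]) -> Option<FnSig> {
    match (base, type_params) {
        // llvm.ctpop.*: (T) -> T
        ("llvm.ctpop", &[t]) => Some(FnSig { args: vec![t], ret: t }),
        // llvm.ctlz.*: (T, i1) -> T
        ("llvm.ctlz", &[t]) => Some(FnSig { args: vec![t, Ty::I1], ret: t }),
        // llvm.fma.*: (T, T, T) -> T
        ("llvm.fma", &[t]) => Some(FnSig { args: vec![t, t, t], ret: t }),
        // Unknown intrinsic or wrong number of type parameters.
        _ => None,
    }
}

fn main() {
    println!("{:?}", signature("llvm.ctlz", &[Ty::I32]));
    println!("{:?}", signature("llvm.fma", &[Ty::F32]));
}
```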

I do know about Intrinsic::getDeclaration. The reason I didn't use this is performance. The problem with the C API function LLVMGetIntrinsicDeclaration is that it computes the FunctionType (because Intrinsic::getDeclaration does), and then throws it away (It calls getCallee on the FunctionCallee object returned by Intrinsic::getDeclaration). So in total, I have to compute the FunctionType twice, which is pretty expensive. So I just duplicated the implementation of getDeclaration, but with a known FunctionType.

The reason I didn't get rid of the list of intrinsics is also performance - Intrinsic::getType calls are expensive, and calling that every time we are comparing 2 integers seems like a major perf issue.

I believe that nikic's proposed change can be implemented on top of this simplified file. I think we should merge this because it's already better, but I would indeed like to see an even simpler version and we can run perf on it to see if it affects anything.

So are you suggesting to completely remove the cache? Or should I keep a cache like FxHashMap<(String, SmallVec<[&'ll Type; 2]>), (&'ll Type, &'ll Value)> that will dynamically be filled whenever an intrinsic is called?

Another PR with this as a base, then we can have a nice chat about where to go.

> I do know about Intrinsic::getDeclaration. The reason I didn't use this is performance. The problem with the C API function LLVMGetIntrinsicDeclaration is that it computes the FunctionType (because Intrinsic::getDeclaration does), and then throws it away (It calls getCallee on the FunctionCallee object returned by Intrinsic::getDeclaration). So in total, I have to compute the FunctionType twice, which is pretty expensive. So I just duplicated the implementation of getDeclaration, but with a known FunctionType.

Intrinsic::getDeclaration() does not discard the FunctionType. The FunctionType is part of the Function. You can fetch it using get_type_of_global().

> The reason I didn't get rid of the list of intrinsics is also performance - Intrinsic::getType calls are expensive, and calling that every time we are comparing 2 integers seems like a major perf issue.

I don't really follow how having a list of intrinsics makes things faster. Doesn't going through the list and matching all the names just add cost? Besides, you are caching the result anyway, so whatever you do, it will only happen once per intrinsic + type?
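The "only once per intrinsic + type" argument can be sketched as a lazily filled cache. This is an illustration under assumptions, not the PR's code: `DeclCache` is a hypothetical name, type parameters are plain strings, the `u32` handle stands in for the declared function, and the counter marks where an expensive call such as Intrinsic::getType would happen.

```rust
use std::collections::HashMap;

/// Hypothetical lazily filled cache of intrinsic declarations.
#[derive(Default)]
struct DeclCache {
    /// (base name, type parameters) -> handle of the declared function.
    handles: HashMap<(String, Vec<String>), u32>,
    /// How many times the expensive path ran.
    expensive_lookups: u32,
}

impl DeclCache {
    fn get(&mut self, base: &str, type_params: &[&str]) -> u32 {
        let key = (
            base.to_string(),
            type_params.iter().map(|t| t.to_string()).collect::<Vec<_>>(),
        );
        // Fast path: this intrinsic + type combination was declared before.
        if let Some(&handle) = self.handles.get(&key) {
            return handle;
        }
        // Slow path: runs once per distinct key.
        self.expensive_lookups += 1;
        let handle = self.handles.len() as u32;
        self.handles.insert(key, handle);
        handle
    }
}

fn main() {
    let mut cache = DeclCache::default();
    let a = cache.get("llvm.ctpop", &["i32"]);
    let b = cache.get("llvm.ctpop", &["i32"]);
    let c = cache.get("llvm.ctpop", &["i64"]);
    println!("handles: {a} {b} {c}, expensive lookups: {}", cache.expensive_lookups);
}
```

Repeated requests for the same key hit the fast path, so the cost of the slow path is paid per distinct intrinsic and type combination, not per call site.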

@nikic sorry, I misinterpreted your comment as a suggestion to completely remove the cache. Yes, that makes sense. I am holding off on it for this PR; let's have a separate PR for it, as @workingjubilee suggested.

workingjubilee · 2025-06-13T04:39:45Z

@bors r+

bors · 2025-06-13T04:39:48Z

📌 Commit d56fcd9 has been approved by workingjubilee

It is now in the queue for this repository.

rustbot assigned workingjubilee Jun 9, 2025

rustbot added the A-codegen Area: Code generation label Jun 9, 2025

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 9, 2025

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 10, 2025

nikic reviewed Jun 10, 2025

workingjubilee requested changes Jun 11, 2025

rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 11, 2025

Simplify implementation of Rust intrinsics by using type parameters in the cache (d56fcd9)

sayantn force-pushed the simplify-intrinsics branch from ea453f7 to d56fcd9 Compare June 11, 2025 19:03

sayantn requested a review from workingjubilee June 11, 2025 19:06

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jun 11, 2025

workingjubilee approved these changes Jun 13, 2025

bors removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 13, 2025

bors added the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Jun 13, 2025
