Enforce the compiler-builtins partitioning scheme #135395
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -165,7 +165,8 @@ where | |
// estimates. | ||
{ | ||
let _prof_timer = tcx.prof.generic_activity("cgu_partitioning_merge_cgus"); | ||
- merge_codegen_units(cx, &mut codegen_units); | ||
+ let cgu_contents = merge_codegen_units(cx, &mut codegen_units); | ||
+ rename_codegen_units(cx, &mut codegen_units, cgu_contents); | ||
debug_dump(tcx, "MERGE", &codegen_units); | ||
} | ||
|
||
|
@@ -200,7 +201,6 @@ where | |
I: Iterator<Item = MonoItem<'tcx>>, | ||
{ | ||
let mut codegen_units = UnordMap::default(); | ||
- let is_incremental_build = cx.tcx.sess.opts.incremental.is_some(); | ||
let mut internalization_candidates = UnordSet::default(); | ||
|
||
// Determine if monomorphizations instantiated in this crate will be made | ||
|
@@ -227,20 +227,8 @@ where | |
} | ||
} | ||
|
||
- let characteristic_def_id = characteristic_def_id_of_mono_item(cx.tcx, mono_item); | ||
- let is_volatile = is_incremental_build && mono_item.is_generic_fn(); | ||
|
||
- let cgu_name = match characteristic_def_id { | ||
- Some(def_id) => compute_codegen_unit_name( | ||
- cx.tcx, | ||
- cgu_name_builder, | ||
- def_id, | ||
- is_volatile, | ||
- cgu_name_cache, | ||
- ), | ||
- None => fallback_cgu_name(cgu_name_builder), | ||
- }; | ||
|
||
+ let cgu_name = | ||
+ compute_codegen_unit_name(cx.tcx, cgu_name_builder, mono_item, cgu_name_cache); | ||
let cgu = codegen_units.entry(cgu_name).or_insert_with(|| CodegenUnit::new(cgu_name)); | ||
|
||
let mut can_be_internalized = true; | ||
|
@@ -321,7 +309,7 @@ where | |
fn merge_codegen_units<'tcx>( | ||
cx: &PartitioningCx<'_, 'tcx>, | ||
codegen_units: &mut Vec<CodegenUnit<'tcx>>, | ||
- ) { | ||
+ ) -> UnordMap<Symbol, Vec<Symbol>> { | ||
assert!(cx.tcx.sess.codegen_units().as_usize() >= 1); | ||
|
||
// A sorted order here ensures merging is deterministic. | ||
|
@@ -331,6 +319,13 @@ fn merge_codegen_units<'tcx>( | |
let mut cgu_contents: UnordMap<Symbol, Vec<Symbol>> = | ||
codegen_units.iter().map(|cgu| (cgu.name(), vec![cgu.name()])).collect(); | ||
|
||
+ // When compiling compiler_builtins, we do not want to put multiple intrinsics in a CGU. | ||
+ // There may be mergeable CGUs under this constraint, but just skipping over merging is much | ||
+ // simpler. | ||
+ if cx.tcx.is_compiler_builtins(LOCAL_CRATE) { | ||
+ return cgu_contents; | ||
+ } | ||
|
||
// If N is the maximum number of CGUs, and the CGUs are sorted from largest | ||
// to smallest, we repeatedly find which CGU in codegen_units[N..] has the | ||
// greatest overlap of inlined items with codegen_units[N-1], merge that | ||
|
@@ -421,6 +416,14 @@ fn merge_codegen_units<'tcx>( | |
// Don't update `cgu_contents`, that's only for incremental builds. | ||
} | ||
|
||
+ cgu_contents | ||
+ } | ||
|
||
+ fn rename_codegen_units<'tcx>( | ||
+ cx: &PartitioningCx<'_, 'tcx>, | ||
+ codegen_units: &mut Vec<CodegenUnit<'tcx>>, | ||
+ cgu_contents: UnordMap<Symbol, Vec<Symbol>>, | ||
+ ) { | ||
let cgu_name_builder = &mut CodegenUnitNameBuilder::new(cx.tcx); | ||
|
||
// Rename the newly merged CGUs. | ||
|
@@ -678,13 +681,26 @@ fn characteristic_def_id_of_mono_item<'tcx>( | |
} | ||
} | ||
|
||
- fn compute_codegen_unit_name( | ||
- tcx: TyCtxt<'_>, | ||
+ fn compute_codegen_unit_name<'tcx>( | ||
+ tcx: TyCtxt<'tcx>, | ||
name_builder: &mut CodegenUnitNameBuilder<'_>, | ||
- def_id: DefId, | ||
- volatile: bool, | ||
+ mono_item: MonoItem<'tcx>, | ||
cache: &mut CguNameCache, | ||
) -> Symbol { | ||
+ // When compiling compiler_builtins, we do not want to put multiple intrinsics in a CGU. | ||
+ // Using the symbol name as the CGU name puts every GloballyShared item in its own CGU, but in | ||
+ // an optimized build we actually want every item in the crate that isn't an intrinsic to get | ||
+ // LocalCopy so that it is easy to inline away. In an unoptimized build, this CGU naming | ||
+ // strategy probably generates more CGUs than we strictly need. But it is simple. | ||
+ if tcx.is_compiler_builtins(LOCAL_CRATE) { | ||
+ let name = mono_item.symbol_name(tcx); | ||
+ return Symbol::intern(name.name); | ||
One of the symbols in compiler-builtins is 132 characters long; together with the crate name and the temporary directory, this could exceed MAX_PATH on Windows, I think. Maybe hash the name if its length exceeds, say, 50 characters.
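A minimal sketch of that suggestion, for illustration only. `shorten_cgu_name` and `MAX_CGU_NAME_LEN` are hypothetical names, not part of this PR or of rustc. The sketch uses the standard library's `DefaultHasher`, whose output is not guaranteed to be stable across Rust releases, so a real change would presumably need a stable hash instead.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical cutoff, taken from the "say 50 characters" suggestion.
const MAX_CGU_NAME_LEN: usize = 50;

/// Hypothetical helper: returns the symbol name unchanged when it is short
/// enough, otherwise keeps a readable prefix and appends a hash of the full
/// name so that distinct long symbols still map to distinct CGU names
/// (up to hash collisions).
fn shorten_cgu_name(symbol_name: &str) -> String {
    if symbol_name.len() <= MAX_CGU_NAME_LEN {
        return symbol_name.to_owned();
    }

    // Hash the complete name, not just the part that gets cut off.
    let mut hasher = DefaultHasher::new();
    symbol_name.hash(&mut hasher);
    let digest = format!("{:016x}", hasher.finish());

    // Leave room for the separator and the 16 hex digits.
    let mut end = MAX_CGU_NAME_LEN - digest.len() - 1;
    // Symbol names are normally ASCII, but never split a UTF-8 sequence.
    while !symbol_name.is_char_boundary(end) {
        end -= 1;
    }

    format!("{}.{}", &symbol_name[..end], digest)
}
```

Keeping a prefix of the original symbol is a design choice of this sketch: it keeps object file names recognizable in the archive while bounding their length.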
||
+ } | ||
|
||
+ let Some(def_id) = characteristic_def_id_of_mono_item(tcx, mono_item) else { | ||
+ return fallback_cgu_name(name_builder); | ||
+ }; | ||
|
||
// Find the innermost module that is not nested within a function. | ||
let mut current_def_id = def_id; | ||
let mut cgu_def_id = None; | ||
|
@@ -712,6 +728,9 @@ fn compute_codegen_unit_name( | |
|
||
let cgu_def_id = cgu_def_id.unwrap(); | ||
|
||
+ let is_incremental_build = tcx.sess.opts.incremental.is_some(); | ||
+ let volatile = is_incremental_build && mono_item.is_generic_fn(); | ||
|
||
*cache.entry((cgu_def_id, volatile)).or_insert_with(|| { | ||
let def_path = tcx.def_path(cgu_def_id); | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
[package] | ||
name = "scratch" | ||
version = "0.1.0" | ||
edition = "2021" | ||
|
||
[lib] | ||
path = "lib.rs" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
#![no_std] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,105 @@ | ||
//! The compiler_builtins library is special. It exists to export a number of intrinsics which may | ||
//! also be provided by libgcc or compiler-rt, and when an intrinsic is provided by another | ||
//! library, we want that definition to override the one in compiler_builtins because we expect | ||
//! that those implementations are more optimized than compiler_builtins. To make sure that an | ||
//! attempt to override a compiler_builtins intrinsic does not result in a multiple definitions | ||
//! linker error, the compiler has special CGU partitioning logic for compiler_builtins that | ||
//! ensures every intrinsic gets its own CGU. | ||
//! | ||
//! This test is slightly overfit to the current compiler_builtins CGU naming strategy; it doesn't | ||
//! distinguish between "multiple intrinsics are in one object file!" which would be very bad, and | ||
//! "This object file has an intrinsic and also some of its helper functions!" which would be okay. | ||
Maybe filter out all symbols starting with |
||
//! | ||
//! This test ensures that the compiler_builtins rlib has only one intrinsic in each object file. | ||
|
||
// wasm and nvptx targets don't produce rlib files that object can parse. | ||
//@ ignore-wasm | ||
//@ ignore-nvptx64 | ||
|
||
#![deny(warnings)] | ||
|
||
use std::str; | ||
|
||
use run_make_support::object::read::Object; | ||
use run_make_support::object::read::archive::ArchiveFile; | ||
use run_make_support::object::{ObjectSymbol, SymbolKind}; | ||
use run_make_support::rfs::{read, read_dir}; | ||
use run_make_support::{cargo, object, path, target}; | ||
|
||
fn main() { | ||
println!("Testing compiler_builtins CGU partitioning for {}", target()); | ||
|
||
// CGU partitioning has some special cases for codegen-units=1, so we also test 2 CGUs. | ||
for cgus in [1, 2] { | ||
for profile in ["debug", "release"] { | ||
run_test(profile, cgus); | ||
} | ||
} | ||
} | ||
|
||
fn run_test(profile: &str, cgus: usize) { | ||
println!("Testing with profile {profile} and -Ccodegen-units={cgus}"); | ||
|
||
let target_dir = path("target"); | ||
|
||
let mut cmd = cargo(); | ||
cmd.args(&[ | ||
"build", | ||
"--manifest-path", | ||
"Cargo.toml", | ||
"-Zbuild-std=core", | ||
"--target", | ||
&target(), | ||
]) | ||
.env("RUSTFLAGS", &format!("-Ccodegen-units={cgus}")) | ||
.env("CARGO_TARGET_DIR", &target_dir) | ||
.env("RUSTC_BOOTSTRAP", "1") | ||
// Visual Studio 2022 requires that the LIB env var be set so it can | ||
// find the Windows SDK. | ||
.env("LIB", std::env::var("LIB").unwrap_or_default()); | ||
if profile == "release" { | ||
cmd.arg("--release"); | ||
} | ||
cmd.run(); | ||
|
||
let rlibs_path = target_dir.join(target()).join(profile).join("deps"); | ||
let compiler_builtins_rlib = read_dir(rlibs_path) | ||
.find_map(|e| { | ||
let path = e.unwrap().path(); | ||
let file_name = path.file_name().unwrap().to_str().unwrap(); | ||
if file_name.starts_with("libcompiler_builtins") && file_name.ends_with(".rlib") { | ||
Some(path) | ||
} else { | ||
None | ||
} | ||
}) | ||
.unwrap(); | ||
|
||
// rlib files are archives, where the archive members are our CGUs, and we also have one called | ||
// lib.rmeta which is the encoded metadata. Each of the CGUs is an object file. | ||
let data = read(compiler_builtins_rlib); | ||
|
||
let archive = ArchiveFile::parse(&*data).unwrap(); | ||
for member in archive.members() { | ||
let member = member.unwrap(); | ||
if member.name() == b"lib.rmeta" { | ||
continue; | ||
} | ||
let data = member.data(&*data).unwrap(); | ||
let object = object::File::parse(&*data).unwrap(); | ||
|
||
let mut global_text_symbols = 0; | ||
println!("Inspecting object {}", str::from_utf8(&member.name()).unwrap()); | ||
for symbol in object | ||
.symbols() | ||
.filter(|symbol| matches!(symbol.kind(), SymbolKind::Text)) | ||
.filter(|symbol| symbol.is_definition() && symbol.is_global()) | ||
{ | ||
println!("symbol: {:?}", symbol.name().unwrap()); | ||
global_text_symbols += 1; | ||
} | ||
// Assert that this object/CGU does not define multiple global text symbols. | ||
// We permit the 0 case because some CGUs may only be assigned a static. | ||
assert!(global_text_symbols <= 1); | ||
} | ||
} |
Is there a way to make inlining inside the crate more likely without causing MIR for all functions in compiler-builtins to get encoded in the crate metadata?
I think what you're pointing out here is that these functions are not reachable as MIR, so we don't need to encode MIR for them. The problem as I see it is that our notion of reachable uses this worklist/visited algorithm that tracks items in a path-independent way:
rust/compiler/rustc_passes/src/reachable.rs, lines 168 to 173 in 2ae9916
Also, we already have an issue for the inverse inefficiency, emitting object code when we only need MIR: #119214
I put a hack in this place specifically because the compiler is designed around this function returning either true or false for whatever reason, past the first few checks. I'm not aware of anywhere else we could make a small localized change to get the behavior we want.
The only other place I could think of putting a hack is `MonoItem::instantiation_mode`, but that doesn't work because then we get linker errors: instantiation mode needs to agree with `exported_symbols`, and those disagree because `exported_symbols` is based on `reachable_set`. I really think the inaccuracy of the `reachable_set` analysis is the root problem here, and it's net better to implement this in a non-invasive way that will be fixed automatically if `reachable_set` gets improved.
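To illustrate what "path-independent" means here, a generic sketch of a worklist/visited traversal follows. This is not the code in reachable.rs; `ItemId`, `reachable_set`, and the `edges` map are hypothetical stand-ins for illustration.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for an item identifier.
type ItemId = u32;

/// Generic worklist/visited reachability: starting from `roots`, follow
/// `edges` (item -> items it references) until nothing new is found.
fn reachable_set(
    roots: &[ItemId],
    edges: &BTreeMap<ItemId, Vec<ItemId>>,
) -> BTreeSet<ItemId> {
    let mut visited = BTreeSet::new();
    let mut worklist: Vec<ItemId> = roots.to_vec();

    while let Some(item) = worklist.pop() {
        // `insert` returns false if the item was already visited. The set
        // records only *that* an item was reached, never *how*.
        if !visited.insert(item) {
            continue;
        }
        if let Some(referenced) = edges.get(&item) {
            worklist.extend(referenced.iter().copied());
        }
    }

    visited
}
```

In this kind of traversal an item reached through several different paths ends up as a single entry in the result, so the output cannot distinguish items by the route that made them reachable.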
Also, if I back up to my merge-base, `x build library`, then `ar x` the stage1-std libcompiler_builtins.rlib and run `du -sch *`, I get:
Then with my changes:
So even though it's not perfect, this PR is still a net win.