Skip to content

Conversation

@osiewicz
Copy link
Contributor

@osiewicz osiewicz commented Jan 14, 2026

Admittedly this is not the cleanest way to achieve this, but SourceMap is quite intertwined with source files being represented as Strings.
Tracking issue: rust-lang/cargo#14136
Closes: #151090

@rustbot rustbot added A-run-make Area: port run-make Makefiles to rmake.rs S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jan 14, 2026
@rustbot
Copy link
Collaborator

rustbot commented Jan 14, 2026

r? @jieyouxu

rustbot has assigned @jieyouxu.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rust-log-analyzer

This comment has been minimized.

@osiewicz osiewicz force-pushed the 151090-checksum-freshness-binary-files branch from abd65b1 to 1d5c70e Compare January 14, 2026 20:30
@jieyouxu
Copy link
Member

@rustbot reroll

@rustbot rustbot assigned mati865 and unassigned jieyouxu Jan 15, 2026
@mati865
Copy link
Member

mati865 commented Jan 16, 2026

Looks good to me, but I don't feel confident enough approving it.

@rustbot reroll

CCing @bjorn3 who I think should be able to give useful feedback here.

@rustbot rustbot assigned fee1-dead and unassigned mati865 Jan 16, 2026
// Build a list of files used to compile the output and
// write Makefile-compatible dependency rules
let mut files: Vec<(String, u64, Option<SourceFileHash>)> = sess
let mut files: IndexMap<String, (u64, Option<SourceFileHash>)> = sess
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe make this IndexMap<String, Option<(u64, Option<SourceFileHash>)>> where None indicates that the length and hash haven't been filled in yet?

Copy link
Contributor Author

@osiewicz osiewicz Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that get us anything? This PR makes it so that files with binary contents are inserted into both source_map.files() (with null hash unnormalized_source_len equal to 0) and file_depinfo; with the use of IndexMap, we can override those coming from source_map.files() with hashes coming from file_depinfo. Thus, the value in that IndexMap would never be None.

Now is that the cleanest way? I'm sure not.. but I honestly don't understand implications of removing these "null" binary entries from the sourcemap well enough to make that cut.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes some amount of sense to keep them like this, as null entries? Especially if the has can never really be None as Piotr says. @bjorn3 does Piotr's comment convince you? Or d'ya have a good reason why making it an Option makes sense

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are first inserting into files with a dummy value of hash 0 + len 0 and later overwriting it with the real value, right? My suggestion is to first insert None and later overwrite it with the real value. This way you can unwrap the entry at use and be guaranteed that you didn't accidentally get a dummy value.

Copy link
Contributor Author

@osiewicz osiewicz Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point. What's tricky about it is that we have no way of knowing whether the items in files are "bugged" or not.
That is to say, in the context of that function, we have no way to tell if an entry with hash 0 + len 0 originated in a include_bytes!("my_1024mb_file.bin"); call (bugged one, that this PR tries to address) or include_str!("empty_file.txt") (legit one). Both of them have a hash 0 + len 0, and it is the correct value for the include_str one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can change line 639 to return None instead of (0, None), right? And adapt the return type of hash_iter_files accordingly. Or am I missing something?

Copy link
Contributor Author

@osiewicz osiewicz Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that branch (on line 639) is not relevant at all and that's not where we were getting our fake-nulls ((hash 0, len 0)) from.
Before this PR, the file path of a binary file (with non-UTF8 contents) was injected into depfile through source_map().files(). For non-utf8 files the values were not correct, since SourceMap only accepts Strings as the contents of a source file. See the following snippet, as that's where we "fake out" contents of a binary file when inserting it into a SourceMap:

let text = std::str::from_utf8(&bytes).unwrap_or("").to_string();

Crucially, we did not/do not call hash_iter_files on these, as we assume that SourceMap calculated the length and a hash correctly.. Which it did not, in some cases (non-utf8 binary files). This then led to binary files having a hash of empty string and a length of 0.

With this PR, that's still the case - binary files are still present in source_map().files() result, but they're also injected into file_depinfo and that lets us call hash_iter_files on them and get the correct hash/file length. The change to files type let's us override the "faulty" value with a good one. However, as outlined in https://github.com/rust-lang/rust/pull/151137/files#r2726164490 , there's no straightforward way (IMHO) to use an option here, because if we knew which entries from the SourceMap were "faulty", we could just call hash_iter_files on them directly without jumping through the hoops.. And line 639 is not really at fault for what we're struggling with.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. In that case I'm fine with leaving the code as is.

@fee1-dead
Copy link
Member

@rustbot reroll

@rustbot rustbot assigned jdonszelmann and unassigned fee1-dead Jan 18, 2026
Copy link
Contributor

@jdonszelmann jdonszelmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, if @bjorn3 's concern is lifted, r=me

View changes since this review

// Build a list of files used to compile the output and
// write Makefile-compatible dependency rules
let mut files: Vec<(String, u64, Option<SourceFileHash>)> = sess
let mut files: IndexMap<String, (u64, Option<SourceFileHash>)> = sess
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes some amount of sense to keep them like this, as null entries? Especially if the has can never really be None as Piotr says. @bjorn3 does Piotr's comment convince you? Or d'ya have a good reason why making it an Option makes sense

@osiewicz osiewicz force-pushed the 151090-checksum-freshness-binary-files branch from 1d5c70e to 0df94dd Compare January 26, 2026 11:10
Copy link
Contributor Author

@osiewicz osiewicz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have addressed the outstanding review comments and am happy with the code as is; unless anybody objects, I think this PR is good to go (from my side).

View changes since this review

@jdonszelmann
Copy link
Contributor

@bors r+

@rust-bors
Copy link
Contributor

rust-bors bot commented Jan 26, 2026

📌 Commit 0df94dd has been approved by jdonszelmann

It is now in the queue for this repository.

@rust-bors rust-bors bot added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 26, 2026
JonathanBrouwer added a commit to JonathanBrouwer/rust that referenced this pull request Jan 26, 2026
…-binary-files, r=jdonszelmann

checksum-freshness: Fix invalid checksum calculation for binary files

Admittedly this is not the cleanest way to achieve this, but SourceMap is quite intertwined with source files being represented as Strings.
Tracking issue: rust-lang/cargo#14136
Closes: rust-lang#151090
rust-bors bot pushed a commit that referenced this pull request Jan 26, 2026
…uwer

Rollup of 4 pull requests

Successful merges:

 - #151137 (checksum-freshness: Fix invalid checksum calculation for binary files)
 - #148187 (Remove uses of `&mut CmResolver`)
 - #151626 (Remove `Deref<Target = TyCtxt>` from `QueryCtxt`)
 - #151661 (Suggest changing `iter`/`into_iter` when the other was meant)
@rust-bors

This comment has been minimized.

rust-bors bot pushed a commit that referenced this pull request Jan 26, 2026
…es, r=jdonszelmann

checksum-freshness: Fix invalid checksum calculation for binary files

Admittedly this is not the cleanest way to achieve this, but SourceMap is quite intertwined with source files being represented as Strings.
Tracking issue: rust-lang/cargo#14136
Closes: #151090
@rust-bors rust-bors bot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Jan 26, 2026
@rust-bors
Copy link
Contributor

rust-bors bot commented Jan 26, 2026

💔 Test for 14ee0d1 failed: CI. Failed job:

@JonathanBrouwer
Copy link
Contributor

@bors r-
Failed ci

@rust-bors rust-bors bot added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Jan 26, 2026
@rust-bors
Copy link
Contributor

rust-bors bot commented Jan 26, 2026

Commit 0df94dd has been unapproved.

@rust-bors rust-bors bot removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 26, 2026
@JonathanBrouwer
Copy link
Contributor

@bors r=jdonszelmann
Was spurious, github's problem

@rust-bors
Copy link
Contributor

rust-bors bot commented Jan 26, 2026

📌 Commit 0df94dd has been approved by jdonszelmann

It is now in the queue for this repository.

🌲 The tree is currently closed for pull requests below priority 1000. This pull request will be tested once the tree is reopened.

@rust-bors rust-bors bot added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jan 26, 2026
@rust-log-analyzer
Copy link
Collaborator

A job failed! Check out the build log: (web) (plain enhanced) (plain)

Click to see the possible cause of the failure (guessed by this bot)
##[error]Could not find a part of the path 'D:\a'.

JonathanBrouwer added a commit to JonathanBrouwer/rust that referenced this pull request Jan 26, 2026
…-binary-files, r=jdonszelmann

checksum-freshness: Fix invalid checksum calculation for binary files

Admittedly this is not the cleanest way to achieve this, but SourceMap is quite intertwined with source files being represented as Strings.
Tracking issue: rust-lang/cargo#14136
Closes: rust-lang#151090
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Jan 26, 2026
…-binary-files, r=jdonszelmann

checksum-freshness: Fix invalid checksum calculation for binary files

Admittedly this is not the cleanest way to achieve this, but SourceMap is quite intertwined with source files being represented as Strings.
Tracking issue: rust-lang/cargo#14136
Closes: rust-lang#151090
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Jan 26, 2026
…-binary-files, r=jdonszelmann

checksum-freshness: Fix invalid checksum calculation for binary files

Admittedly this is not the cleanest way to achieve this, but SourceMap is quite intertwined with source files being represented as Strings.
Tracking issue: rust-lang/cargo#14136
Closes: rust-lang#151090
rust-bors bot pushed a commit that referenced this pull request Jan 27, 2026
Rollup of 6 pull requests

Successful merges:

 - #148718 (Do not mention `-Zmacro-backtrace` for std macros that are a wrapper around a compiler intrinsic)
 - #151137 (checksum-freshness: Fix invalid checksum calculation for binary files)
 - #150863 (Adds two new Tier 3 targets - `aarch64v8r-unknown-none{,-softfloat}`)
 - #151040 (Don't expose redundant information in `rustc_public`'s `LayoutShape`)
 - #151699 (Update books)
 - #151700 (os allow missing_docs)
Zalathar added a commit to Zalathar/rust that referenced this pull request Jan 27, 2026
…-binary-files, r=jdonszelmann

checksum-freshness: Fix invalid checksum calculation for binary files

Admittedly this is not the cleanest way to achieve this, but SourceMap is quite intertwined with source files being represented as Strings.
Tracking issue: rust-lang/cargo#14136
Closes: rust-lang#151090
rust-bors bot pushed a commit that referenced this pull request Jan 27, 2026
Rollup of 12 pull requests

Successful merges:

 - #147996 (Stabilize ppc inline assembly)
 - #148718 (Do not mention `-Zmacro-backtrace` for std macros that are a wrapper around a compiler intrinsic)
 - #151137 (checksum-freshness: Fix invalid checksum calculation for binary files)
 - #151680 (Update backtrace and windows-bindgen)
 - #150863 (Adds two new Tier 3 targets - `aarch64v8r-unknown-none{,-softfloat}`)
 - #151040 (Don't expose redundant information in `rustc_public`'s `LayoutShape`)
 - #151383 (remove `#[deprecated]` from unstable & internal `SipHasher13` and `24` types)
 - #151529 (lint: Use rustc_apfloat for `overflowing_literals`, add f16 and f128)
 - #151669 (rename uN::{gather,scatter}_bits to uN::{extract,deposit}_bits)
 - #151689 (Fix broken Xtensa installation link)
 - #151699 (Update books)
 - #151700 (os allow missing_docs)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-run-make Area: port run-make Makefiles to rmake.rs S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

checksum-freshness: does not work with binary embedded files

10 participants