Skip to content

WIP: Support changing crate name case #3034

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

Rantanen
Copy link

Is #1451 still on the table?

I made some implementation here, but the download functionality becomes somewhat problematic. Currently the crate download works by redirecting the client to AWS with path of:

/crates/<crate_name>/<crate_name>-<version>.crate.

This causes issues if the crate case is later renamed since the redirection will end up attempting to download the previously updated crate packages with the new name.

The solution here was to add upload_name column to the versions table that gets populated by the current crate name when migrating the database or publishing new versions.

When I started the implementation, I figured this would be just a simple matter of supporting the rename in the database in this specific case, however now that I encountered the issue with the file storage, I'd like to confirm whether this approach is suitable for crates.io at all.

The two concerns I have are:

  • Each version needs to store the crate name separately.
  • The database migration needs to update all versions.

With ~50k crates according to crates.io front page and assuming around 10 versions per crate on average, that would yield an estimate of 500k versions. I'd guess that updating 500k versions during migration isn't too big of an issue. If the average crate name is ~20 characters, the increase in database size would be around 10 MB (plus some database overhead), which doesn't sound too bad either.

The current implementation attached to this PR is work in progress - I just wanted to get a PR up for discussion. The unimplemented bits are setting the upload_name when a new version is published and fixing the test case for the redirect.

@rust-highfive
Copy link

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @carols10cents (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@jtgeibel
Copy link
Member

jtgeibel commented Nov 22, 2020

Thanks for your interest in this @Rantanen. This is an area where we need to be very careful about our integration with cargo and also consider downstream consumers (like docs.rs and lib.rs). There are several places I see where we need to be careful:

  • The index repository
    • The path to the file with the crate's information likely needs to remain fixed, with the original casing, for compatibility reasons. While it might be possible to rename the file to the preferred casing, I don't see any reason to at this point unless other design considerations make it worth considering. [Edit: the file path in the index is always converted to lowercase.]
    • Within the file, should the name property for a version reflect the canonical (original) spelling or the casing that version was published with?
  • The published crate asset
    • What casing is used for the file name? I haven't looked to see if cargo uses the filename as served from S3 or constructs it's own when saving it in .cargo/registry/cache/*/. cargo and other tools that download crate assets could become confused if the expected casing changes between versions.
    • For security reasons, when a crate is published we check that the folder name within the tarball is {crate_name}-{crate_version}. This folder name determines where the files will be extracted under .cargo/registry/src/*/.
    • The name within the embedded Cargo.toml and Cargo.toml.orig files.
    • In general, I think it would be confusing if any of these strings differ in casing between these locations for a individual release. I think it would also cause issues (either with cargo or other tools) to have the casing change from one release to the next. Every tool working with crate assets would need to handle this correctly, when for the past 5+ years code assuming that these values would not change has been working correctly.

Finally, if we had solid answers to these questions, we would want the implementation to have solid integration tests between cargo and the crates.io service. Unfortunately, we don't yet have the testing infrastructure in place to do the sort of end-to-end testing (i.e. invoking cargo from our tests) we would want for this sort of change.

I've opened rust-lang/cargo#8883 as an approach to help users avoid mistakes here.

@Rantanen
Copy link
Author

  • The path to the file with the crate's information likely needs to remain fixed, with the original casing, for compatibility reasons. While it might be possible to rename the file to the preferred casing, I don't see any reason to at this point unless other design considerations make it worth considering.

The path is already lower cased: Metamorfish crate is found in the index under me/ta/metamorfish

I feel at least some of the concerns would be alleviated by requiring a major version bump when changing the casing. Then excluding version = "*" dependencies, at least Cargo wouldn't be stuck resolving a crate "Foo" and ending up with a version for one called "foo". Can't speak for lib.rs/etc.

Also just to be clear, I'm okay with closing this PR as "won't fix" as long as the original issues gets the same conclusion. Having a clear answer of "this won't happen" would be nicer than having a "this would be nice (but in reality we won't do anything to it to preserve backwards compatibility)". :)

(I'm not suggesting this should be closed, but the amount of concerns seems quite large and I'm not sure if having this possibility outweighs the risks. While I already got quite attached to my Metamorfish, I'm currently thinking of a new name for that crate to workaround this issue myself.)

@jtgeibel
Copy link
Member

Thanks for the correction on the file path. I should have double checked that.

The main alternative I see is that we could store a per crate display name, with a constraint that the crate and display names normalize to the same value. Then the exact crate name could be an implementation detail that is less prominent.

Right now cargo will suggest the correct crate name if the wrong casing is given. If it silently fixed the name, then we could always display the preferred version on the site, however rustc would have to handle this too. As you point out, right now if we allowed a rename, crates could only change casing on a major version bump because every usage of Metamorfish in a path or use statement would have to be changed to metamorfish. (Note that the same is not true for - vs _ because the language only allows the latter.)

So it looks like this would probably require an RFC covering crates.io, cargo, and the language to really get it right. For that reason, I agree it is best to close #1451 as well. We can always reopen it if someone finds an approach with fewer sharp edges.

@jtgeibel jtgeibel closed this Nov 24, 2020
@Razican
Copy link

Razican commented Jan 11, 2021

So it looks like this would probably require an RFC covering crates.io, cargo, and the language to really get it right. For that reason, I agree it is best to close #1451 as well. We can always reopen it if someone finds an approach with fewer sharp edges.

Hi @jtgeibel, is there a tracking issue / RFC in order to implement this plan?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants