-
Notifications
You must be signed in to change notification settings - Fork 643
Remove restriction that a crate must be in the database to get a download URL #405
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove restriction that a crate must be in the database to get a download URL #405
Conversation
Thanks for the PR @carols10cents! This sounds pretty reasonable to me, and I'm game! Exciting to see work on mirrors :) I personally prefer to ignore results through either I wonder though, perhaps just the failed-to-lookup error could be ignored, but things like db errors could still get plumbed through? Also, as a side note, with source replacement now on nightly I believe it should be as simple as: # in .cargo/config
[source.crates-io]
registry = 'https://github.com/rust-lang/crates.io-index'
replace-with = 'my-mirror'
[source.my-mirror]
registry = 'https://...' |
And with that as well Cargo will generate lock files that are interoperable with the default registry as well as mirror registries! |
If we're a mirror, though, we'd still want to ignore all errors, including db errors, right? And send the download URL regardless? I have this thought that I haven't researched the implications of yet, that some mirrors might not need to have a database at all. And if we weren't acting as a mirror (in this case, that we do want to be counting downloads), then we actually don't want to ignore ANY errors from updating the download counts, right? :-/ It almost feels like we'd need to have a setting that said whether we were a mirror or not, but then we'd also want to be able to say what kind of mirror we were: are we trying to be an exact match of crates.io? A superset that allows additional (private) uploads? A subset that has crates that have been audited and approved? This is where I start getting overwhelmed as to how much to start with, and how to implement a little bit at a time but be extensible :) I'd love ideas! |
I added a new commit to change to
This is so awesome!!! My [source.mirror]
registry = "https://gitlab.com/integer32llc/crates.io-index"
[source.crates-io]
replace-with = "mirror" and I can remove that, do a clean rebuild, and the |
Thanks! So thinking about this a bit more, this means that if you're trying to download "crate-that-does-not-exist" at "version-that-does-not-exist", you'd get a 302 to go talk to S3? We may want to head that off quickly with a 404 (or at least a 200 with a JSON error like we do today), though? Or at least crates.io itself may wish to do that even if mirrors do not |
Before we even get to that point, though, cargo looks in its locally checked out index and won't even make a request to this route if the crate+version isn't in its index. I'm not sure how/if/how likely it is to get into a situation where a crate+version IS in the index but ISN'T on s3... do you have any ideas?
The problem I'm having is how to tell if we're crates.io or if we're a mirror. What mechanism would you suggest? Should I add another environment variable that's like... |
Yeah it's true that Cargo itself may not send these requests to S3, but I'm also somewhat worried about other clients who download directly from crates.io for whatever reason (or future Cargo implementations perhaps). Maybe it's not worth worrying about them though? I guess thinking a bit more about what this should be (in terms of mechanisms), what kind of mirror are you going for here? I believe that this is going to a mirror of the index, but not necessarily the crates themselves, right? Maybe a caching mirror would look different where it'd lazily fill in packages from S3 but it still would require everything to exist locally? |
For this step, I'm going for an API-only mirror. I thought this would be a good midpoint between what exists now and a mirror that lazily caches the packages to its own storage, but if you'd rather I proceed directly to trying to get a mirror like that working, I'd be into that :) A lazy cache mirror still would have the same problem, in that when you hit the download route, it needs to have different behavior than crates.io does now, so how do we configure a mirror vs not a mirror? |
Hm yeah so if we want to end up at just an API mirror then this makes sense to me (perhaps with a setting for crates.io to differ), but if that's not the end goal we'd end up changing this anyway, right? I guess I'd expect a download route to behave with logic like:
In any case though I'd be fine merging this for now with something like |
Yeah i have no idea if an API-only mirror is anything that's useful by itself! I mean, it would work, I just don't know if that's anything useful enough for anyone to run. (shrug) |
By default, we are not a mirror.
This might fail because we don't have any crates in the database; we just want to be an API mirror and pass along the redirect URL.
f54dceb
to
4a7d270
Compare
Added the MIRROR=1 setting! :) |
Looks good to me, thanks! |
….0.0, r=Turbo87 Bump ember-fetch from 6.5.1 to 7.0.0 Bumps [ember-fetch](https://github.com/ember-cli/ember-fetch) from 6.5.1 to 7.0.0. <details> <summary>Release notes</summary> *Sourced from [ember-fetch's releases](https://github.com/ember-cli/ember-fetch/releases).* > ## Release 7.0.0 > #### 💥 Breaking Change > * [#292](https://github-redirect.dependabot.com/ember-cli/ember-fetch/pull/292) Added a check to enforce top-level dependency [#290](https://github-redirect.dependabot.com/ember-cli/ember-fetch/issues/290). ([@​dnalagatla](https://github.com/dnalagatla)) > * [#348](https://github-redirect.dependabot.com/ember-cli/ember-fetch/pull/348) Bump deps, drop node 6 ([@​xg-wang](https://github.com/xg-wang)) > > #### 📝 Documentation > * [#405](https://github-redirect.dependabot.com/ember-cli/ember-fetch/pull/405) Add release-it ([@​xg-wang](https://github.com/xg-wang)) > * [#399](https://github-redirect.dependabot.com/ember-cli/ember-fetch/pull/399) Add section to the README for deprecations ([@​locks](https://github.com/locks)) > > #### 🏠 Internal > * [#405](https://github-redirect.dependabot.com/ember-cli/ember-fetch/pull/405) Add release-it ([@​xg-wang](https://github.com/xg-wang)) > > #### Committers: 4 > - Dinesh Nalagatla ([@​dnalagatla](https://github.com/dnalagatla)) > - Ricardo Mendes ([@​locks](https://github.com/locks)) > - Thomas Wang ([@​xg-wang](https://github.com/xg-wang)) > - [[@​dependabot-preview](https://github.com/dependabot-preview)[bot]](https://github.com/apps/dependabot-preview) > > ## v6.7.2 (2019-11-03) > > #### 🐛 Bug Fix > * [#372](https://github-redirect.dependabot.com/ember-cli/ember-fetch/pull/372) fix: throwing w/ fresh ember-cli-fastboot serve ([@​xg-wang](https://github.com/x > g-wang)) > > #### Committers: 1 > - Thomas Wang ([@​xg-wang](https://github.com/xg-wang)) > > ## v6.7.1 (2019-09-12) > > #### 🐛 Bug Fix > * [#358](https://github-redirect.dependabot.com/ember-cli/ember-fetch/pull/358) Enable absolute url transform for Request in FastBoot ([@​xg-wang](https://github.com/xg-wang)) > > #### Committers: 2 > - Jan Bobisud ([@​bobisjan](https://github.com/bobisjan)) > - Thomas Wang ([@​xg-wang](https://github.com/xg-wang)) > > ## v6.7.0 > #### 🚀 Enhancement > * [#303](https://github-redirect.dependabot.com/ember-cli/ember-fetch/pull/303) Add fetch response and error utils ([@​BarryThePenguin](https://github.com/BarryThePenguin)) > > #### Committers: 3 > - Jonathan Haines ([@​BarryThePenguin](https://github.com/BarryThePenguin)) > - Thomas Wang ([@​xg-wang](https://github.com/xg-wang)) > - [[@​dependabot-preview](https://github.com/dependabot-preview)[bot]](https://github.com/apps/dependabot-preview) > > ## v6.6.0 > #### 📝 Documentation > * [#308](https://github-redirect.dependabot.com/ember-cli/ember-fetch/pull/308) Deprecate ember-data adapter mixin ([@​xg-wang](https://github.com/xg-wang)) > > #### Committers: 2 ></tr></table> ... (truncated) </details> <details> <summary>Changelog</summary> *Sourced from [ember-fetch's changelog](https://github.com/ember-cli/ember-fetch/blob/master/CHANGELOG.md).* > ## v7.0.0 (2019-11-28) > > #### 💥 Breaking Change > * [#292](https://github-redirect.dependabot.com/ember-cli/ember-fetch/pull/292) Added a check to enforce top-level dependency [#290](https://github-redirect.dependabot.com/ember-cli/ember-fetch/issues/290). ([@​dnalagatla](https://github.com/dnalagatla)) > * [#348](https://github-redirect.dependabot.com/ember-cli/ember-fetch/pull/348) Bump deps, drop node 6 ([@​xg-wang](https://github.com/xg-wang)) > > #### 📝 Documentation > * [#405](https://github-redirect.dependabot.com/ember-cli/ember-fetch/pull/405) Add release-it ([@​xg-wang](https://github.com/xg-wang)) > * [#399](https://github-redirect.dependabot.com/ember-cli/ember-fetch/pull/399) Add section to the README for deprecations ([@​locks](https://github.com/locks)) > > #### 🏠 Internal > * [#405](https://github-redirect.dependabot.com/ember-cli/ember-fetch/pull/405) Add release-it ([@​xg-wang](https://github.com/xg-wang)) > > #### Committers: 4 > - Dinesh Nalagatla ([@​dnalagatla](https://github.com/dnalagatla)) > - Ricardo Mendes ([@​locks](https://github.com/locks)) > - Thomas Wang ([@​xg-wang](https://github.com/xg-wang)) > - [[@​dependabot-preview](https://github.com/dependabot-preview)[bot]](https://github.com/apps/dependabot-preview) > > ## v6.7.2 (2019-11-03) > > #### 🐛 Bug Fix > * [#372](https://github-redirect.dependabot.com/ember-cli/ember-fetch/pull/372) fix: throwing w/ fresh ember-cli-fastboot serve ([@​xg-wang] > g-wang)) > > #### Committers: 1 > - Thomas Wang ([@​xg-wang](https://github.com/xg-wang)) > > ## v6.7.1 (2019-09-12) > > #### 🐛 Bug Fix > * [#358](https://github-redirect.dependabot.com/ember-cli/ember-fetch/pull/358) Enable absolute url transform for Request in FastBoot ([@​xg-wang](https://github.com/xg-wang)) > > #### Committers: 2 > - Jan Bobisud ([@​bobisjan](https://github.com/bobisjan)) > - Thomas Wang ([@​xg-wang](https://github.com/xg-wang)) > > ## v6.7.0 (2019-07-08) > > #### 🚀 Enhancement > > - [#303](https://github-redirect.dependabot.com/ember-cli/ember-fetch/pull/303) Add fetch response and error utils ([@​BarryThePenguin](https://github.com/BarryThePenguin)) > > #### Committers: 3 > > - Jonathan Haines ([@​BarryThePenguin](https://github.com/BarryThePenguin)) > - Thomas Wang ([@​xg-wang](https://github.com/xg-wang)) > - [[@​dependabot-preview](https://github.com/dependabot-preview)[bot]](https://github.com/apps/dependabot-preview) > > ## v6.6.0 (2019-06-28) ></tr></table> ... (truncated) </details> <details> <summary>Commits</summary> - [`e525762`](ember-cli/ember-fetch@e525762) Release 7.0.0 - [`231f20d`](ember-cli/ember-fetch@231f20d) Add publishConfig.registry - [`424cead`](ember-cli/ember-fetch@424cead) Merge pull request [#405](https://github-redirect.dependabot.com/ember-cli/ember-fetch/issues/405) from xg-wang/ri - [`4e2fe99`](ember-cli/ember-fetch@4e2fe99) pick v6.7.2 and add release-it - [`bac5db0`](ember-cli/ember-fetch@bac5db0) Merge pull request [#292](https://github-redirect.dependabot.com/ember-cli/ember-fetch/issues/292) from dnalagatla/dnalagatla/enforce-top-level-addon - [`914f60a`](ember-cli/ember-fetch@914f60a) Bump @typescript-eslint/eslint-plugin from 2.8.0 to 2.9.0 - [`82a41f9`](ember-cli/ember-fetch@82a41f9) Bump ember-cli-babel from 7.12.0 to 7.13.0 - [`5b9720e`](ember-cli/ember-fetch@5b9720e) Bump ember-source from 3.14.1 to 3.14.2 - [`512fdf8`](ember-cli/ember-fetch@512fdf8) Bump @typescript-eslint/parser from 2.8.0 to 2.9.0 - [`a8eb84a`](ember-cli/ember-fetch@a8eb84a) Bump ember-try from 1.2.1 to 1.3.0 - Additional commits viewable in [compare view](ember-cli/ember-fetch@v6.5.1...v7.0.0) </details> <br /> [](https://dependabot.com/compatibility-score.html?dependency-name=ember-fetch&package-manager=npm_and_yarn&previous-version=6.5.1&new-version=7.0.0) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- **Note:** This repo was added to Dependabot recently, so you'll receive a maximum of 5 PRs for your first few update runs. Once an update run creates fewer than 5 PRs we'll remove that limit. You can always request more updates by clicking `Bump now` in your [Dependabot dashboard](https://app.dependabot.com). <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) - `@dependabot badge me` will comment on this PR with code to add a "Dependabot enabled" badge to your readme Additionally, you can set the following in the `.dependabot/config.yml` file in this repo: - Update frequency - Automerge options (never/patch/minor, and dev/runtime dependencies) - Out-of-range updates (receive only lockfile updates, if desired) - Security updates (receive only security updates, if desired) </details>
So I've been thinking about/working on making it easier to run a crates.io mirror. The mirror in particular that I'm working on would basically just mirror the crates.io API:
.cargo/config
to point to an alternate indexS3_BUCKET
set tocrates-io
and itsS3_REGION
set tous-west-1
, so it would NOT be mirroring the actual crate filesThe problem I hit was that in order to get a redirect from
/crates/:crate_id/:version/download
to S3, the crate and version needed to be in the database, because the current code would try to update the download counts.I thought about creating a script that would load the contents of the index back into a database, but then it would have to keep that in sync as changes to the index happened, so I stepped back from that tactic and decided to try this.
This PR tries to increment the download count, but if that isn't possible for any reason, return the redirect to S3 anyway. Cargo checks the index it has before requesting the URL from the API anyway, so as long as the index is a (possibly lagging behind) mirror, cargo should not be requesting files that don't exist on S3. If it does, cargo will just say "failed to download package whatever from http://...s3". This might cause other problems to go unnoticed, but seeing as this is just the download count and any problems with the database would likely manifest in other ways, I think this is ok.
Is there a more idiomatic way to use the
Result
but ignore errors? I know that's not recommended in general, but that's really what I'm trying to do here :)With this patch deployed to https://cratesio-mirror.herokuapp.com/, I'm able to set my
.cargo/config
to use https://gitlab.com/integer32llc/crates.io-index and install crates!I'm totally open to other ideas if there's a better way to solve this problem!