
Better documentation for docs.rs/:crate/latest #854

Closed
robinmoussu opened this issue Jun 26, 2020 · 17 comments
Labels
A-frontend Area: Web frontend P-low Low priority issues S-needs-design Status: There's a problem here, but no obvious solution; or the solution raises other questions

Comments

@robinmoussu

Currently URLs look like this: https://docs.rs/$crate/$version/$crate/$page.html. I propose that it should be possible to use the following alias: https://docs.rs/$crate/$page.html. This alias would always display the docs of the latest version of the crate.

  • I think most users want to consult the latest version of the docs. If not, it should still be possible to use the URL that includes a version number.
  • It would make it easier to have cross-site links that are always up to date (like on Stack Overflow).
  • I don't know if it's possible, but it would be nice to specify in robots.txt that only those pages should be indexed and not the ones with explicit version numbers (so all indexed pages would always point to the latest version, instead of (usually) outdated ones).
@jyn514
Member

jyn514 commented Jun 26, 2020

This is one of our earliest features: https://docs.rs/$crate/latest/$crate/$page.html. How could we better document this? Currently it's on https://docs.rs/about and also in the readme.

@jyn514 jyn514 changed the title from "[feature request] url that always points to the docs of the latest version" to "Better documentation for docs.rs/:crate/latest" Jun 26, 2020
@jyn514
Member

jyn514 commented Jun 26, 2020

I don't know if it's possible, but it would be nice to specify in robots.txt that only those pages should be indexed and not the ones with explicit version numbers (so all indexed pages would always point to the latest version, instead of (usually) outdated ones).

Currently our robots.txt is almost blank 😅 it would be good to clean it up. There's a guide at https://support.google.com/webmasters/answer/6062596?hl=en&ref_topic=6061961 that looks useful, with a full reference here (see especially the bit about matching based on path values).
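
To illustrate what path-based matching could look like, here is a minimal sketch of robots.txt-style rule evaluation with `*` wildcards and a trailing `$` anchor, where the longest matching pattern wins and `allow` wins ties. The two rules in `RULES` are hypothetical and are not docs.rs's actual robots.txt; `pattern_to_regex` and `is_allowed` are illustrative names.

```python
import re

# Hypothetical rules (an assumption, NOT docs.rs's real robots.txt):
# block everything except paths under /<crate>/latest/.
RULES = [
    ("disallow", "/"),
    ("allow", "/*/latest/"),
]


def pattern_to_regex(pattern):
    """Translate a path pattern ('*' wildcard, optional trailing '$') to a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and let each '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(path):
    """Apply the most specific (longest) matching rule; 'allow' wins ties."""
    best = None  # (pattern length, is_allow)
    for kind, pattern in RULES:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "allow")
            if best is None or candidate > best:
                best = candidate
    # With no matching rule, crawling is allowed by default.
    return True if best is None else best[1]
```

Under these assumed rules, `/serde/latest/serde/index.html` would be crawlable while `/serde/1.0.0/serde/index.html` would not.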

@robinmoussu
Author

How could we better document this?

The "go to latest version" link should redirect to that page. I think that would naturally increase the score of that page (since many other pages would link to it), which would in turn push the latest version to the top of search engine results.

@jyn514
Member

jyn514 commented Jun 26, 2020

That page is itself a redirect so that we can avoid caching the latest page (imagine: someone goes to 1.0, they 'go to latest version' which is 2.0, now 3.0 is published but /latest is cached to point to 2.0). So I don't think there's any point redirecting there, you'll never see it in the address bar.

@jyn514
Member

jyn514 commented Jun 26, 2020

@jyn514 jyn514 added the A-frontend Area: Web frontend label Jun 26, 2020
@Kixiron
Member

Kixiron commented Jun 26, 2020

I think pretty much all we can do is adjust what search engines index. I've personally run into stale doc links (read: not latest) in searches, and fixing that would be a good enhancement going forward. As for documenting it, I think it's documented as well as it can be. Adding even more redirect logic than we have to doesn't sound fun and would (most likely) require another request to S3, which isn't ideal for what's already our slowest page class. While looking at this I also noticed that our sitemap uses URLs in the https://docs.rs/:crate form when we should probably be using the https://docs.rs/crate/:crate form to avoid reserved names. Reiterating what Jyn said, changing the "go to latest" links wouldn't do anything meaningful for human interactions, since people don't look at the links they click and wouldn't have time to see it. As for robot interactions with that link, #143 asks whether or not we should be allowing search engines to index us in a no-holds-barred fashion.

@robinmoussu
Author

It's a wild idea, feel free to reject it (and anyway I don't think I have the skills to do it myself).

Each time a new version is published, two sets of pages would be generated:

  • https://docs.rs/$crate/latest/$crate/$page.html
  • https://docs.rs/$crate/$version/$crate/$page.html

This means the following things would change:

  • Unlike what is done currently, latest wouldn't be a redirect, but real HTML pages.
  • All links in $version pages would point to $version, while all links in latest pages would point to latest. If someone has a cached copy of an outdated page in latest, all of its links would therefore point to up-to-date pages.
  • All 404 errors inside https://docs.rs/$crate/latest/$crate/ would have a link to https://docs.rs/$crate/latest/$crate/index.html.
  • All $version pages would have a "go to latest version" button that points to the equivalent page under the latest path. This button never needs to be updated, but may be hidden or made less visible if the page is already the latest version (if deemed useful).

This means that most of the time people would use latest pages, and the $version pages would almost never be accessed. When people copy-paste links, for example on Stack Overflow, there is a much higher chance that the link wouldn't become outdated. This would also increase cache hits, since latest would most probably already be cached by CDNs, given that almost only latest pages would be accessed.
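
The "equivalent page under the latest path" mapping can be sketched as a pure URL rewrite. This is only an illustration that assumes the /$crate/$version/$crate/$page.html layout described in this thread; `to_latest_url` is a hypothetical helper, not part of docs.rs.

```python
from urllib.parse import urlsplit, urlunsplit


def to_latest_url(url):
    """Replace the version segment of a docs.rs-style URL with 'latest'.

    Assumes the path layout /$crate/$version/$crate/$page.html.
    """
    parts = urlsplit(url)
    segments = parts.path.split("/")  # ['', crate, version, ...]
    if len(segments) < 3 or not segments[1] or not segments[2]:
        raise ValueError(f"no version segment in {url!r}")
    segments[2] = "latest"
    return urlunsplit(parts._replace(path="/".join(segments)))
```

Because the rewrite never needs to know which version is currently the newest, a button built this way would not have to be regenerated when a new release is published.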

@rust-lang rust-lang deleted a comment from robinmoussu Jun 26, 2020
@rust-lang rust-lang deleted a comment from robinmoussu Jun 26, 2020
@jyn514
Member

jyn514 commented Jun 26, 2020

Each time a new version is published, two sets of pages would be generated:

  • https://docs.rs/$crate/latest/$crate/$page.html
  • https://docs.rs/$crate/$version/$crate/$page.html

This is not a workable solution, it would double our storage costs.

What we could do instead is serve https://docs.rs/$crate/$latest_version/$crate/$page.html whenever someone accesses https://docs.rs/$crate/latest/$crate/$page.html. However I'm not sure that's the best approach: now people no longer know what version of the docs they're looking at off-hand, and we confuse search engines because the same content is duplicated on two different pages.
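
A minimal sketch of that idea, where `latest` is resolved to a concrete version before the storage lookup, so nothing is stored twice. The `RELEASES` table and the storage path layout are assumptions for illustration, and `version_key` only handles plain MAJOR.MINOR.PATCH strings (no pre-releases or build metadata).

```python
# Hypothetical release index; a real service would query its database.
RELEASES = {"serde": ["1.0.0", "1.0.2", "1.0.10"]}


def version_key(version):
    """Sort key for plain MAJOR.MINOR.PATCH versions (pre-releases not handled)."""
    return tuple(int(part) for part in version.split("."))


def resolve_version(crate, requested):
    """Map the 'latest' alias to the newest known version; pass others through."""
    if requested != "latest":
        return requested
    return max(RELEASES[crate], key=version_key)


def storage_path(crate, requested, page):
    """Build the (assumed) storage lookup path; 'latest' never appears in it."""
    return f"{crate}/{resolve_version(crate, requested)}/{crate}/{page}"
```

Here `storage_path("serde", "latest", "index.html")` and `storage_path("serde", "1.0.10", "index.html")` resolve to the same stored object.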

@pietroalbini
Member

What we could do instead is serve https://docs.rs/$crate/$latest_version/$crate/$page.html whenever someone accesses https://docs.rs/$crate/latest/$crate/$page.html. However I'm not sure that's the best approach: now people no longer know what version of the docs they're looking at off-hand, and we confuse search engines because the same content is duplicated on two different pages.

I think this is the right approach, and the version is already included in the heading anyway. Regarding search engines, we could include a rel=canonical link in every page:

<link rel="canonical" href="https://docs.rs/{crate}/latest/{path}">
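
As a rough sketch of how such a tag could be generated per page (the `canonical_link` helper and its `version` parameter are hypothetical; passing a concrete version instead of the default shows the alternative of making the versioned URL canonical):

```python
from html import escape


def canonical_link(crate, path, version="latest"):
    """Render a rel=canonical tag for a page, escaping the href for HTML."""
    href = f"https://docs.rs/{crate}/{version}/{path}"
    return f'<link rel="canonical" href="{escape(href, quote=True)}">'
```

For example, `canonical_link("serde", "serde/index.html")` yields a tag pointing at https://docs.rs/serde/latest/serde/index.html.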

@jyn514
Member

jyn514 commented Jun 29, 2020

Ok, it shouldn't be too hard to implement then. I don't think we want to make /latest/ the canonical page though since it will be constantly changing, instead /{version} should be the canonical page.

@jyn514 jyn514 added E-easy Effort: Should be easy to implement and would make a good first PR P-low Low priority issues labels Jun 29, 2020
@robinmoussu
Author

Each time a new version is published, two sets of pages would be generated:

This is not a workable solution, it would double our storage costs.

It wouldn't. The latest folder would be overwritten each time. N + 1 versions would be stored at any given time, with N the number of releases and the +1 being the latest folder. Additionally, any $version folder could probably be stored in a compressed form, since they would be accessed much less (assuming people nearly always use the latest version).

I think this is the right approach, and the version is already included in the heading anyway. Regarding search engines, we could include a rel=canonical link in every page:

  • If you search for something in a given version, do the links on that page point to the same version?
  • If you search for something in the latest version and copy-paste a link to some external place (like Stack Overflow), and then a newer version is released, does the link on Stack Overflow point to the newest version or the original version?

@pietroalbini
Member

Ok, it shouldn't be too hard to implement then. I don't think we want to make /latest/ the canonical page though since it will be constantly changing, instead /{version} should be the canonical page.

Hmm? There is supposed to be only a single canonical version of each page.

It wouldn't. The latest folder would be overwritten each time. N + 1 versions would be stored at any given time, with N the number of releases and the +1 being the latest folder. Additionally, any $version folder could probably be stored in a compressed form, since they would be accessed much less (assuming people nearly always use the latest version).

Well the easier solution here is to just treat latest as an alias for the latest version when querying stuff from the underlying storage.

If you search for something in a given version, do the links on that page point to the same version?

I'm not sure I follow the question: if we implement rel="canonical" users will only find the documentation for the latest version on search engines.

If you search for something in the latest version and copy-paste a link to some external place (like Stack Overflow), and then a newer version is released, does the link on Stack Overflow point to the newest version or the original version?

Well, it will point to the latest version, as you're visiting /latest/.

@Kixiron
Member

Kixiron commented Jun 30, 2020

I think it's also worth mentioning that when you go to old docs there's a "Go to latest version" button in the top left that does exactly that. If anything, having what's essentially an eternally changing link in old SO posts will hurt users trying to find answers, since the new docs aren't guaranteed to have the same interfaces as the old ones. Old links to latest will frequently give you 404s, with no way of figuring out what the user was actually trying to help you with.

@jyn514 jyn514 added S-needs-design Status: There's a problem here, but no obvious solution; or the solution raises other questions and removed E-easy Effort: Should be easy to implement and would make a good first PR labels Jul 6, 2020
@workingjubilee
Member

When appropriate I frequently actually deliberately forge links without the version in the URL because I want my link to hit the redirect and thus the latest version.

"Don't implement a path because it might 404 later" cuts against all existing paths on docs.rs as well.

@Nemo157
Member

Nemo157 commented Aug 27, 2020

One option to improve the 404 situation might be to redirect to a search page if using a non-exact version and the page does not exist in the resolved version, similar to how "go to latest version" works (which would then make "go to latest" act pretty much the same as redirecting to s/<current version>/*/).
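
A minimal sketch of that fallback, under stated assumptions: `PAGES` is a hypothetical listing of the pages in the resolved version, `respond` is an illustrative name, and the `?search=` redirect target is an assumed URL shape rather than a confirmed docs.rs route.

```python
from urllib.parse import quote

# Hypothetical listing of pages that exist in the resolved version.
PAGES = {"serde/index.html", "serde/trait.Serialize.html"}


def respond(crate, resolved_version, page, exact_version):
    """Return (status, location) for a request after version resolution."""
    base = f"/{crate}/{resolved_version}"
    if page in PAGES:
        return 200, f"{base}/{page}"
    if exact_version:
        # The caller asked for this exact version, so a plain 404 is honest.
        return 404, None
    # Non-exact version (e.g. 'latest') and the page is gone:
    # search for the item name instead of showing a 404.
    name = page.rsplit("/", 1)[-1]
    if name.endswith(".html"):
        name = name[: -len(".html")]
    item = name.split(".")[-1]  # 'struct.Removed' -> 'Removed'
    return 302, f"{base}/{crate}/?search={quote(item)}"
```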

@robinmoussu
Author

Another option would be a link that automatically redirects to the latest version, unless the page no longer exists there, in which case it would serve the original page.

Something like "https://docs.rs/{crate}/latest/{path}?original_version=1.0.38" or "https://docs.rs/{crate}/latest/{path}/1.0.38".
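
The query-parameter variant could be sketched roughly as follows. Everything here is illustrative: `PAGES_BY_VERSION`, `LATEST`, and `resolve` are hypothetical, and the sketch assumes the /{crate}/latest/{path}?original_version=... shape proposed above.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical index of which pages exist in which version.
PAGES_BY_VERSION = {
    "1.0.38": {"foo/fn.old.html", "foo/index.html"},
    "2.0.0": {"foo/index.html"},
}
LATEST = "2.0.0"


def resolve(url):
    """Pick the path to serve for /{crate}/latest/{path}?original_version=X."""
    parts = urlsplit(url)
    crate, alias, page = parts.path.lstrip("/").split("/", 2)
    if alias != "latest":
        raise ValueError("expected a /latest/ URL")
    # Prefer the latest version whenever the page still exists there.
    if page in PAGES_BY_VERSION[LATEST]:
        return f"/{crate}/{LATEST}/{page}"
    # Otherwise fall back to the version the link was created against.
    original = parse_qs(parts.query).get("original_version", [None])[0]
    if original and page in PAGES_BY_VERSION.get(original, set()):
        return f"/{crate}/{original}/{page}"
    raise LookupError(f"{page} not found in latest or original version")
```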

@jyn514
Member

jyn514 commented Nov 13, 2021

I'm going to close this as a duplicate of #1438.

@jyn514 jyn514 closed this as completed Nov 13, 2021
6 participants