Rustdoc: Option to insert rel="canonical" links to docs.rs (deduplicate search engine results) #143139

@laundmo

Description

Currently, when rustdoc output is hosted publicly, the docs for the entire dependency tree may get indexed by search engines, and for some searches they are even ranked above the official docs for those crates.

I propose a rustdoc config option and/or command-line flag that generates <link rel="canonical" href="..."> links in the <head> of every page for dependency items. This should hint to search engines that they should prefer the docs.rs version in search results.

Motivation

I've come across this issue mainly with Bevy, where some plugin authors choose to host their own docs, including the entire dependency tree.

I'm aware of the following sites hosting alternate versions of the bevy docs:

Searches which have resulted in one of these pages being prioritized (using Google, some also on Ecosia):

  • bevy Plane3d
  • bevy VolumetricFogPlugin (second, first was a docs.rs /?search= link)
  • bevy FilteredAccessSet (second, first was a docs.rs /?search= link)
  • bevy avian debug render (second, first is Avian repo)

While alternate docs sometimes appear on the first page of Bing and DDG, I've not found any searches where they get ranked above docs.rs. Note that search results vary, so my findings may not be reproducible.

This is not ideal, especially for Bevy, which has made some rather useful customizations to its docs, like highlighting core traits at the top.

Even if this is a temporary issue, and all we'd have to do is wait, I think it would still make sense to have an option which emits these links.

Details

Canonical links work by telling search engines the main URL they should use for a page. They should be included in the <head> of the page.

Rustdoc would need to decide which crates this applies to. Here are the conditions that I imagine would make sense to check:

  • Don't apply to any crate in the workspace
  • Don't apply to git or path dependencies (?)
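
The conditions above could be sketched roughly as follows. This is only an illustration: `CrateOrigin` and `is_canonical_candidate` are hypothetical names, not existing rustdoc types, and how rustdoc would learn where a crate comes from (for example, via information passed in by Cargo) is left open here.

```rust
/// Where a documented crate comes from.
/// Hypothetical type for illustration; not part of rustdoc.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CrateOrigin {
    /// A member of the workspace being documented.
    WorkspaceMember,
    /// A `path = "..."` dependency.
    Path,
    /// A `git = "..."` dependency.
    Git,
    /// A dependency pulled from crates.io.
    Registry,
}

/// Returns true if pages for a crate with this origin should point
/// at docs.rs. Only crates.io dependencies qualify; workspace, path
/// and git crates are assumed to have no matching docs.rs page.
pub fn is_canonical_candidate(origin: CrateOrigin) -> bool {
    matches!(origin, CrateOrigin::Registry)
}
```

Whether git and path dependencies should really be excluded is the open question marked with (?) above; the sketch simply takes the conservative answer.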

It might be worth considering whether there should be an option to always generate canonical links. For development docs (unreleased changes), that could be a useful and less extreme option than disallowing all crawling using robots.txt.

With that in mind, the option could look like this:
--insert-canonical: Inserts rel="canonical" links pointing to docs.rs into the <head>
When rendering HTML files, this will insert <link rel="canonical" href="https://docs.rs/<crate>/latest/<crate>/<path_to_item>"> links, hinting to search engines that they should prefer docs.rs over this page.
Possible values:

  • never: Default, don't generate any canonical links
  • deps: Only generate canonical links for crates.io dependencies
  • always: Always generate canonical links to docs.rs
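
A minimal sketch of how the three proposed values might map to link generation. Everything here is hypothetical: `CanonicalMode`, `canonical_link` and the `is_crates_io_dep` parameter are invented for illustration and do not exist in rustdoc. The URL follows the template above literally, using the same name for both `<crate>` segments, and does no HTML escaping.

```rust
use std::str::FromStr;

/// The three values proposed for `--insert-canonical`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CanonicalMode {
    Never,
    Deps,
    Always,
}

impl FromStr for CanonicalMode {
    type Err = String;

    /// Parses the flag value exactly as spelled in the proposal.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "never" => Ok(Self::Never),
            "deps" => Ok(Self::Deps),
            "always" => Ok(Self::Always),
            other => Err(format!("unknown --insert-canonical value `{other}`")),
        }
    }
}

/// Builds the `<link>` tag for one page, or `None` if no link
/// should be emitted under the given mode.
pub fn canonical_link(
    mode: CanonicalMode,
    krate: &str,
    is_crates_io_dep: bool,
    item_path: &str,
) -> Option<String> {
    let emit = match mode {
        CanonicalMode::Never => false,
        CanonicalMode::Deps => is_crates_io_dep,
        CanonicalMode::Always => true,
    };
    if !emit {
        return None;
    }
    Some(format!(
        r#"<link rel="canonical" href="https://docs.rs/{krate}/latest/{krate}/{item_path}">"#
    ))
}
```

With `deps`, a call such as `canonical_link(CanonicalMode::Deps, "bevy", true, "struct.Foo.html")` would yield a tag pointing at docs.rs, while the same call with `is_crates_io_dep` set to `false` would yield `None`. The item path here is a made-up example.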

Another open question is whether the default should instead be deps, so that canonical links are always generated for dependencies. That would reduce the need to ask individual people to set this.

Alternatives

I've considered these alternative approaches, but dismissed them for the provided reasons:

  • Asking projects to use --html-in-header
    • AFAIK not possible on a per-crate basis
    • Doesn't have any logic which could be used to make it only apply to dependencies
  • Asking projects to use --no-deps
    • This reduces the utility of the documentation, as it makes certain information harder to find (for example, the list of types implementing a trait becomes incomplete)
    • Makes docs generally less useful
  • Asking projects to exclude/hide dependencies from crawlers using robots.txt or redirects
    • Annoying to create robots.txt, as it would need to list all dependencies individually
    • More difficult, depending on the web host (e.g. GitHub Pages)
    • A lot of effort compared to a command-line flag, making adoption unlikely
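
To make the robots.txt point concrete, here is a rough sketch of what a project would have to generate and keep in sync by hand. `robots_txt` is a hypothetical helper, and it assumes the docs are served from the site root with one top-level directory per dependency crate.

```rust
/// Builds a robots.txt that hides each listed dependency directory
/// from all crawlers. Hypothetical helper for illustration only.
pub fn robots_txt(dependency_dirs: &[&str]) -> String {
    let mut out = String::from("User-agent: *\n");
    for dir in dependency_dirs {
        // One Disallow rule is needed per dependency crate.
        out.push_str(&format!("Disallow: /{dir}/\n"));
    }
    out
}
```

The list passed in would need to be regenerated whenever the dependency tree changes, which is part of why this route seems unlikely to see adoption.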

Metadata

Labels

  • C-feature-request: Category: A feature request, i.e: not implemented / a PR.
  • T-rustdoc: Relevant to the rustdoc team, which will review and decide on the PR/issue.
