
@cpmsmith cpmsmith commented Oct 9, 2025

This is a follow-up to #2569, addressing issues with the standard library scraper I mentioned here.

There are two main fixes:

  1. Names: all pages except modules' index pages ignored which module they came from and were simply prefixed with `std::`. This meant there were 13 pages named `std::Iter`, about structs named `Iter` from different modules. It also meant that things outside modules, e.g. primitive types, were prefixed with `std::`, so the page on `bool` was named `std::bool`, although it can't be referenced that way in code. It also meant there were two pages named `std::char`: one for the module, and one for the primitive type `char`.

    This PR prefixes everything in a module with that module's full path, including submodules, and does not prefix primitives.

    For example:

    | Before | After |
    | --- | --- |
    | `std::fn` | `fn` |
    | `std::Iter` | `std::option::Iter` |
    | `std::MetadataExt` | `std::os::linux::fs::MetadataExt` |
  2. Types: almost everything was filed in `std`, with the exception of modules' index pages and primitive types. This meant there were over 30,000 pages in the `std` type, and many module types containing only a single page.

    This PR creates a type for each module, which includes all of its submodules, and files anything not in a module, e.g. primitive types, in `std`.

    For example:

    | Before (type / name) | After (type / name) |
    | --- | --- |
    | `std::bool` / `std::bool` | `std` / `bool` |
    | `std` / `std::Iter` | `std::option` / `std::option::Iter` |
    | `std` / `std::MetadataExt` | `std::os` / `std::os::linux::fs::MetadataExt` |
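
The two rules above can be sketched roughly as follows. This is an illustrative Python sketch, not the scraper's actual code: the function names `entry_name` and `entry_type` are hypothetical, and it assumes the module path and item name have already been extracted from the page. It also assumes, based on the `std::os` example, that the type is the first module segment under `std`.

```python
from typing import Optional


def entry_name(module_path: Optional[str], item: str) -> str:
    """Page name: module-qualified for items in a module, bare otherwise.

    `module_path` is None for things outside any module (e.g. primitives).
    """
    return f"{module_path}::{item}" if module_path else item


def entry_type(module_path: Optional[str]) -> str:
    """Type (group) a page is filed under.

    Assumption: submodules are folded into their top-level module, so
    `std::os::linux::fs` is filed under `std::os`. Anything outside a
    module is filed under `std`.
    """
    if not module_path:
        return "std"
    return "::".join(module_path.split("::")[:2])


# The rows from the two tables above, as (module path, item) pairs.
examples = [
    (None, "bool"),
    ("std::option", "Iter"),
    ("std::os::linux::fs", "MetadataExt"),
]
for module_path, item in examples:
    print(entry_type(module_path), "/", entry_name(module_path, item))
```

With these inputs, the 13 `Iter` structs would each get a distinct name, because the module path becomes part of the name rather than a fixed `std::` prefix.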

[Screenshots: two before the change, two after.]

@cpmsmith cpmsmith requested a review from a team as a code owner October 9, 2025 17:57
@cpmsmith cpmsmith force-pushed the rust-name-and-type-improvements branch from e11c68c to fbb5e61 Compare October 9, 2025 22:49