doc comment revisions: headings, lists, and links #48305
Replies: 27 comments 125 replies
-
Great to see this overview, thanks! Generally: In such an ecosystem, there is a natural drive to diversify formats, which I do not think is good for Go. However, perhaps it would be fruitful to think of the doc format for the next 10 years as serving a wider set of tools than godoc.org. To me, the principle of "to make [docs] readable as ordinary comments when viewing the source code directly" is not bad because different tools can present the information without needing to delve into rapidly evolving formats or formats with several variants. Specifically: Looking forward to seeing how this evolves, and personally hoping something a little bit nicer than ascii for math can be included (though unicode math support in editors, eg … ⊧ ∃x ax²+bx+c seems to help a little).
-
Issue #38056 may be worth considering, if it hasn't already been, as it has potential intersection with what's described in the "Links to URLs" section. To add some historical background, it appears godoc.org had automatic RFC linking since its initial commit in 2014, with additional enhancements (golang/gddo#276, golang/gddo#282) in 2015.
-
A couple of the documentation websites automatically link to RFCs (godocs, pkg.go). Perhaps |
-
The ToHTML Config should let you specify the level of heading tag to use, which is important for screen reader UX.
-
Because imports are scoped to a file, should this say "if the current file imports" instead of "the current package imports"?
-
How exactly would GoDoc handle links that have the same inner text, such as the
The first paragraph contains 2 links that have the same inner text, so there I think having an additional syntax like |
-
For assumed package names, I also (lightly) propose handling a few other common corner cases that I've noticed. These usually come up as the names for GitHub repositories (to avoid conflicts and make the language clear), and so if there are any Go source files at the root of the repository, the assumed package names would be invalid identifiers.
(we have had to write some of this down in our style guide since this has come up a few times, though this is at least partly our own fault, as we have Go source files at the root of the baseplate.go repository, following the prior convention from baseplate.py)
-
Is there a reason for choosing the link digraph sets with their low false positive rate over the essentially zero false positive rate with the lighter weight on the page |
-
Interfaces, please, let's make them readable in the doc.
-
👋🏼 , Super nice to see this topic. In Packer, we often For links, we use the markdown syntax:
-
Markdown is the de facto standard solution for the problems you're trying to solve here. It certainly has flaws, but in practice they're solvable without major effort. And the benefits of using a more-or-less universally understood syntax hugely outweigh those costs. Go shouldn't spend an innovation token on documentation syntax. It's a solved problem, and the marginal improvements offered by alternatives don't outweigh their concomitant toil. —
To respond to each point...
That's just how these things go 🤷 and while it definitely introduces complexity to the parsers and renderers, it's not intractable. (GitHub-flavored Markdown is what you should be targeting, for the record. It's by far the most widely used.)
It seems like all of these issues can be solved individually without too much fuss. You would end up forking GitHub-flavored Markdown and creating Go-flavored Markdown, or something like that. That's an ideal outcome.
This is subjective, of course, so I'm not sure how much weight you can give to it in the decision-making. Furthermore, most of the enumerated feature requirements are core Markdown stuff, which means basically all Go programmers already know how to read and write them. (Thanks, GitHub!) That usability surely trumps aesthetics in the calculus.
It's true, and in some sense a shame, that Markdown lets you express 1 thing in N ways. IMO, a Go-flavored Markdown could reasonably elide a lot of those "features" — I wouldn't miss HTML elements, or alternate demarcations for headings, for example. Maybe some people would say that it isn't actually Markdown without those features. Fair enough. But that's fine, there's no standards body involved here. Just a point of feedback. Take it as you will.
-
Minor point on links to Go documentation: I wonder if it's feasible to allow links surrounded by spaces (in addition to punctuation/linebreaks), so you can do e.g.
Of course |
-
I have a possibly useful hypothetical situation in mind: Consider an IDE that is cramming documentation into a hovering tooltip. I generally hate this but people have preferences … There’s a particularly strict approach here that might suggest even the smallest amount of poorly handled inline markup leads to a failure to correctly display documentation. I tend to read this way - if the docs have any garbage, my brain instantly switches to solving the garbage problem rather than reading the doc. (Fans of a particular IDE won’t attribute the garbage to the IDE, either…) In the IDE hypothetical, what are the chances and burdens of implementation versus displaying plain text? My reactionary take: the plain text must be pristine. ‘-‘ lists seem reasonable. Links should be more like footnotes - at the bottom, not inline. Emitting a very light AST prior to e.g. HTML, go doc terminal output seems pragmatic but the plain text should always be the assumed mode of rendering.
-
I believe the plan for continuation lines in lists will be widely condemned as aesthetically ugly. Continuation lines should start in the same column as the text of the first line after the marker, thus:
not
It may be that keeping the four-space indentation of continuation lines is valuable for Markdown compatibility. But if so, the initial line should be formatted to match:
This approach may be particularly useful when dealing with numbered lists, whose numbers may not all have the same length:
Also, as a minor note, the proposal specifies that numbered list items can start with any decimal number, but I think in practice it would need to be a non-negative number.
-
If we're going with the route of For usability, I'd still be able to use the |
-
I like this a lot, particularly how simple and close to the current documentation format it is (definitely agree about not importing all of Markdown). Regarding numbered lists: "Item numbers are left as is, never renumbered (unlike Markdown)." What if they have a list with just "2." and "4." in it (or even out-of-order items)? I was unsure how this would be done in HTML, but it looks like there's an explicit |
-
I have noticed that the feature set matches what is used in the Gemini mark-up. I am not suggesting that we start using Gemini markup, even though I would love that.
-
One concern with allowing lists inside pre blocks is that there may be cases where you want to write a pre block that starts with a bullet list marker but don't want it converted to a list. For example sometimes I incorporate some "ASCII art":
This would accidentally start a list I think? Especially combined with gofmt rewriting comments, I feel that pre blocks really should be left alone without exceptions. I fear that anything else will introduce some really unexpected and surprising behaviour in various edge cases. Overall I feel that long-term design should be (strongly) prioritized over short-term compatibility. Nothing will break if pre-block lists are just treated identically to how they are now. This has worked well enough for over ten years, and while some updating may be required to make existing comments better by taking advantage of the new syntax – which is never great – it also means we'll still be "stuck" with these short-term compatibility rules in ten years' time. Similarly, I feel having just one bullet marker would be better as it's more consistent with Go/gofmt's general design aesthetic of "one correct way to write/format it". I don't really care which as such. I appreciate the short-term compatibility advantages and that there's no clear "winner" on which character to use, but it makes Go better in the long term.
-
I'd like to address the explicit omission of illustrations as proposed in #39513
and still accomplish the primary goals of simplicity and readability.
Finally, under the new proposal, what is the behavior of:
-
Having sub-lists be defined would also be very convenient. Maybe sometime in the future, maybe now. I have, a couple of times, needed to explain certain guarantees about the return value of a function, and I annotated this by using lists for guarantees, and bulleted descriptions of these guarantees in sub-lists.
-
Most of the other proposed gofmt cleanups are neutral w.r.t. processing documentation with old tools. This one would convert headings into non-headings from the point of view of existing software. Maybe this is acceptable since use of headings seems fairly rare, and it's not going to make a particularly noticeable difference to how documentation is formatted in the terminal.
-
Should we support linking to a heading declared in the package-level documentation from a specific declaration? For example, let's say my package-level godoc says:
// Package hujson contains a parser and packer for the HuJSON format.
// Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
// tempor incididunt ut labore et dolore magna aliqua.
//
//
// Use with the Standard Library
//
// Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
// tempor incididunt ut labore et dolore magna aliqua.
// Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi
// ut aliquip ex ea commodo consequat.
package hujson
This contains a heading called "Use with the Standard Library". Elsewhere in the godoc for a declaration, I would like to do:
// Deprecated: Do not use. See [Use with the Standard Library] for alternatives.
func Unmarshal([]byte, interface{}) error
-
Probably worth reading https://swift.org/blog/swift-docc/ to see if they have ideas worth stealing.
-
I am looking into the possibility of revising Go's doc comment syntax, specifically adjusting headings and adding lists and links. This discussion is meant to gather feedback before writing an official proposal.
The current Go doc comment format has served us well since its introduction in 2009. There has only been one significant change, which was the addition of headings in 2011. But there are also a few long-open issues and proposals about doc comments, including:
It makes sense, as we approach a decade of experience, to take what we've learned and make one coherent revision, setting the syntax for the next 10 or so years.
Goals and non-goals
The primary design criterion for Go doc comments was to make them readable as ordinary comments when viewing the source code directly, in contrast to systems like C#'s Xmldoc, Java's Javadoc, and Perl's Perlpod. The goal was to prioritize readability, avoiding syntactic ceremony and complexity. This remains a primary goal.
Another concern, new since 2009, is backwards compatibility. Whatever changes we make, existing doc comments must generally continue to render well. Less important, but still something to keep in mind, is forward compatibility: keeping new doc comments rendering well in older Go versions, for a smoother transition.
Another goal for the revamp is to write a separate, standalone web page explaining how to write Go doc comments. Today that information is squirreled away in the doc.ToHTML comment and is not easily found or widely known.
Within those constraints, the focus I have set for this revamp is to address the issues listed above. Specifically:
Make the heading syntax more predictable. The headings rule is clearly difficult to remember and has too many false negatives. But further adjustments of the current rule run the risk of false positives.
Add support for lists. There are many times in documentation when a bullet or numbered list is called for. Those appear in many doc comments today, as indented <pre> blocks.
Add support for links to URLs. Today the only way to link to something is by writing the URL directly, but those can sometimes be quite unreadable and interrupt the text.
Add support for links to Go API documentation, in the current package and in other packages. This would have multiple benefits, but one is the ability in large packages to write top-level doc comments that give a good overview and link directly to the functions and types being described.
I believe it also makes sense to add another goal:
It is not a goal to support every possible kind of documentation or markup. For example:
Plain text has served us very well so far, and while some might prefer that comments allow font changes, the syntactic ceremony and complexity involved seems not worth the benefit, no matter how it is done.
People have asked for support for embedding images in documentation (see #39513), but that adds significant complexity as well: image size hints, different resolutions, image sets, images suitable for both light and dark mode presentation, and so on. It is also difficult (but not impossible) to render them on the command line. Although images clearly have important uses, all this complexity is in direct conflict with the primary goal. For these reasons, images are out of scope. I also note that C#'s Xmldoc and Perl's Perlpod seem not to have image support, although Java's Javadoc does.
Markdown is not the answer, but we can borrow good ideas
An obvious suggestion is to switch to Markdown; this is especially obvious given the discussion being hosted on GitHub where all comments are written in Markdown. I am fairly convinced Markdown is not the answer, for a few reasons.
First, there is no single definition of Markdown, as explained on the CommonMark site. CommonMark is roughly what is used on GitHub, Reddit, and Stack Overflow (although even among those there can be significant variation). Even so, let's define Markdown as CommonMark and continue.
Second, Markdown is not backwards compatible with existing doc comments. Go doc comments require only a single space of indentation to start a <pre> block, while Markdown requires more. Also, it is common for Go doc comments to use Go expressions like `raw strings` or formulas like a*x^2+b*x+c. Markdown would instead interpret those as syntactic markup and render as “raw strings or formulas like ax^2+bx+c”. Existing comments would need to be revised to make them Markdown-safe.
Third, many features in Markdown are not terribly readable. The basics of Markdown can be simple and punctuation-free, but once you get into more advanced uses, there is a surfeit of notation which directly works against the goal of being able to read (and write) program comments in source files without special tooling. Markdown doc comments would end up full of backquotes and underscores and stars, along with backslashes to escape punctuation that would otherwise be interpreted specially. (Here is my favorite recent example of a particularly subtle issue.)
Fourth, Markdown is surprisingly complex. Markdown, befitting its Perl roots, provides more than one way to do just about anything: _i_, *i*, and <em>i</em>; Setext and ATX headings; indented code blocks and fenced code blocks; three different ways to write a link; and so on. There are subtle rules about exactly how many spaces of indentation are required or allowed in different circumstances. All of this harms not just readability but also comprehensibility, learnability, and consistency. The ability to embed arbitrary HTML adds even more complexity. Developers should be spending their time on the code, not on arcane details of documentation formatting.
Of course, Markdown is widely used and therefore familiar to many users. Even though it would be a serious mistake to adopt Markdown in its entirety, it does make sense to look to Markdown for conventions that users would already be familiar with, that we can tailor to Go's needs. If you are a fan of Markdown, you can view this revision as making Go adopt a (very limited) subset of Markdown. If not, you can view it as Go adopting a couple extra conventions that can be defined separately from any Markdown implementation or spec.
Headings
The current rule is:
I can never remember the details of this exact rule, despite having chosen it. Every time I write a heading, I worry about whether it's going to be recognized as such. Others clearly have the same problem (#7349, #31739, #34377). The rule avoided the need for visible syntax, but in retrospect visible syntax would have been simpler. As Markdown shows us, that syntax can be very lightweight: a single “#” would suffice. Therefore I suggest the following:
New Rule: If a span of non-blank lines is a single line beginning with # followed by a space or tab and then additional text, then that line is a heading.
Here are some examples of variations that do not satisfy the rule and are therefore not headings:
Transition: The old heading rule will remain valid, which is acceptable since it mainly has false negatives, not false positives.
Gofmt will rewrite old-style headings into new-style headings, so that the fact of being a heading is made clearer to readers.
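The new heading rule is small enough to sketch directly. The function below is an illustration only: the name isHeading and the representation of a span as a slice of lines are assumptions made for this example, not part of go/doc.

```go
package main

import (
	"fmt"
	"strings"
)

// isHeading reports whether a span of non-blank lines is a heading under
// the proposed rule: the span is exactly one line, and that line is "#",
// then a space or tab, then additional text.
// (Hypothetical helper for illustration; not the go/doc implementation.)
func isHeading(span []string) bool {
	if len(span) != 1 {
		return false
	}
	line := span[0]
	if len(line) < 3 || line[0] != '#' {
		return false
	}
	if line[1] != ' ' && line[1] != '\t' {
		return false
	}
	return strings.TrimSpace(line[2:]) != ""
}

func main() {
	fmt.Println(isHeading([]string{"# Numeric Conversions"})) // true
	fmt.Println(isHeading([]string{"#NoSpace"}))              // false
	fmt.Println(isHeading([]string{"# Two", "lines"}))        // false
}
```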
Lists
There is no support for lists today. As noted before, documentation needing lists uses indented <pre> blocks instead.
For example, here are the docs for cookiejar.PublicSuffixList:
And here are the docs for url.URL.String:
Ideally, we'd like to adopt a rule that makes these into bullet lists without any edits at all. (Markdown's space-counting rules would make these <pre> blocks, not lists.)
Today, a span of lines all indented by one or more spaces or tabs is always a <pre> block.
I suggest the following:
New Rule: In a span of lines all blank or indented by one or more spaces or tabs (which would otherwise be a <pre> block),
if the first indented line begins with a bullet list marker or a numbered list marker,
then that span of indented lines is a bullet list or numbered list.
A bullet list marker is a dash, star, or plus followed by a space or tab and then text.
In a bullet list, each line beginning with a bullet list marker starts a new list item.
A numbered list marker is a decimal number followed by a period or right parenthesis, then a space or tab, and then text.
In a numbered list, each line beginning with a numbered list marker starts a new list item.
Item numbers are left as is, never renumbered (unlike Markdown).
Using this rule, the two doc comments above are both recognized and formatted as bullet lists, not as <pre> blocks.
Note that the rule means that a list item followed by a blank line followed by additional indented text continues the list item (regardless of comparative indentation level):
Note also that there are no code blocks inside list items—any indented paragraph following a list item continues the list item, and the list ends at the next unindented line—nor are there nested lists. This avoids all of the space-counting subtlety of Markdown.
To re-emphasize, a critical property of this definition of lists is that it makes existing doc comments written with pseudo-lists turn into doc comments with real lists.
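As a rough sketch of the list rule, the code below classifies an indented span by looking only at its first non-blank line. The names classify, isBullet, and isNumbered are hypothetical, and the sketch assumes the caller has already established that every line in the span is blank or indented.

```go
package main

import (
	"fmt"
	"strings"
)

// textAfterBlank reports whether rest starts with a space or tab
// and then contains some non-blank text.
func textAfterBlank(rest string) bool {
	if rest == "" || (rest[0] != ' ' && rest[0] != '\t') {
		return false
	}
	return strings.TrimSpace(rest) != ""
}

// isBullet reports whether line begins (after indentation) with a
// dash, star, or plus followed by a space or tab and then text.
func isBullet(line string) bool {
	s := strings.TrimLeft(line, " \t")
	if s == "" || !strings.ContainsRune("-*+", rune(s[0])) {
		return false
	}
	return textAfterBlank(s[1:])
}

// isNumbered reports whether line begins (after indentation) with a
// decimal number, then "." or ")", then a space or tab, then text.
func isNumbered(line string) bool {
	s := strings.TrimLeft(line, " \t")
	i := 0
	for i < len(s) && s[i] >= '0' && s[i] <= '9' {
		i++
	}
	if i == 0 || i >= len(s) || (s[i] != '.' && s[i] != ')') {
		return false
	}
	return textAfterBlank(s[i+1:])
}

// classify decides how an indented span would be treated, based on
// its first non-blank line. (Hypothetical helper for illustration.)
func classify(span []string) string {
	for _, line := range span {
		if strings.TrimSpace(line) == "" {
			continue
		}
		switch {
		case isBullet(line):
			return "bullet list"
		case isNumbered(line):
			return "numbered list"
		}
		return "pre"
	}
	return "pre"
}

func main() {
	fmt.Println(classify([]string{"  - one", "  - two"}))
	fmt.Println(classify([]string{"\t1) first", "", "\t   continued"}))
	fmt.Println(classify([]string{"    x := 1"}))
}
```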
Transition: Gofmt will rewrite recognized bullet and numbered lists to use a standard format. For example, the two doc comments above would reformat to:
and:
The specific formatting rules are discussed in the Formatting section below.
Markdown recognizes three different bullets: -, *, and +. In the main Go repo, the dash is dominant: in comments of the form `//[ \t]+[-+*] ` (grepping, so some of these may not be in doc comments), 84% use -, 14% use *, and 2% use +. In a now slightly dated corpus of external Go code, the star is dominant: 37.6% -, 61.8% *, 0.7% +.
Markdown also recognizes two different numeric list item suffixes: “1.” and “1)”. In the main Go repo, 66% of comments use “1.” (versus 34% for “1)”). In the external corpus, “1.” is again the dominant choice, 81% to 19%.
We have two conflicting goals: handle existing comments well, and avoid needless variation. To satisfy both, all three bullets and both forms of numbers will be recognized, but gofmt (see below) will rewrite them to a single canonical form: dash for bullets, and “N.” for numbers. (Why dashes and not asterisks? Proper typesetting of bullet lists sometimes does use dashes, but never uses asterisks, so using dashes keeps the comments looking as typographically clean as possible.)
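The marker canonicalization described above can be sketched with two regular expressions. This is only an illustration of the "dash for bullets, N. for numbers" choice: the function name canonicalMarker is hypothetical, it operates on one already-indented list line, and it does not attempt the indentation changes that gofmt would also make.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// An indented line starting with -, *, or + followed by blank space and text.
	bulletRE = regexp.MustCompile(`^([ \t]+)[-*+]([ \t]+\S)`)
	// An indented line starting with a decimal number and "." or ")".
	numberRE = regexp.MustCompile(`^([ \t]+)([0-9]+)[.)]([ \t]+\S)`)
)

// canonicalMarker rewrites a list item's marker to the canonical form:
// "-" for bullets and "N." for numbers. Item numbers are kept as written.
// Lines that do not start a list item are returned unchanged.
// (Hypothetical helper for illustration; not gofmt's implementation.)
func canonicalMarker(line string) string {
	if bulletRE.MatchString(line) {
		return bulletRE.ReplaceAllString(line, "${1}-${2}")
	}
	if numberRE.MatchString(line) {
		return numberRE.ReplaceAllString(line, "${1}${2}.${3}")
	}
	return line
}

func main() {
	fmt.Println(canonicalMarker("  * star item"))
	fmt.Println(canonicalMarker("  4) fourth item"))
	fmt.Println(canonicalMarker("  continuation text"))
}
```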
Links to URLs
Documentation is more useful with clear links to other web pages. For example, the encoding/json package doc today says:
There is no link to the actual RFC 7159, leaving the reader to Google it. And the link to the “JSON and Go” article must be copied and pasted. Loosely following the Markdown shortcut reference link format, I suggest the following:
New Rule: A span of unindented non-blank lines defines link targets when each line is of the form “[Text]: URL”. In other text, “[Text]” represents a link to URL using the given text—in HTML, <a href="URL">Text</a>.
For example:
Note that the link definitions can only be given in their own “paragraph” (span of non-blank unindented lines), which can contain more than one such definition, one per line. If there is no corresponding URL declaration, then (except for doc links, described in the next section) the text is not a hyperlink, and the square brackets are preserved.
This format only minimally interrupts the flow of the actual text, since the URLs are moved to a separate section. As already noted, it also roughly matches the Markdown shortcut reference link format, without the optional title text.
Transition: Gofmt will move link definitions to the end of the overall doc comment. Go vet will flag unused link targets. Older versions of Go will show the text verbatim, which is fairly readable.
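To make the link rule concrete, here is a minimal sketch of parsing a link-definition paragraph and substituting “[Text]” references. The helper names parseDefs and renderLinks are hypothetical, the RFC URL is only an illustrative target, and the sketch does not HTML-escape the surrounding text or handle doc links.

```go
package main

import (
	"fmt"
	"html"
	"regexp"
)

var (
	defRE  = regexp.MustCompile(`^\[([^\]]+)\]: (\S+)$`)
	linkRE = regexp.MustCompile(`\[([^\]]+)\]`)
)

// parseDefs returns the link targets defined by a paragraph, or nil if
// any line is not of the form "[Text]: URL" (in which case the
// paragraph is ordinary text, not a definition block).
func parseDefs(para []string) map[string]string {
	if len(para) == 0 {
		return nil
	}
	defs := make(map[string]string)
	for _, line := range para {
		m := defRE.FindStringSubmatch(line)
		if m == nil {
			return nil
		}
		defs[m[1]] = m[2]
	}
	return defs
}

// renderLinks replaces each "[Text]" that has a definition with an HTML
// anchor. Text without a definition keeps its square brackets.
func renderLinks(text string, defs map[string]string) string {
	return linkRE.ReplaceAllStringFunc(text, func(s string) string {
		name := s[1 : len(s)-1]
		url, ok := defs[name]
		if !ok {
			return s
		}
		return `<a href="` + html.EscapeString(url) + `">` + html.EscapeString(name) + `</a>`
	})
}

func main() {
	defs := parseDefs([]string{
		"[RFC 7159]: https://tools.ietf.org/html/rfc7159", // illustrative URL
	})
	fmt.Println(renderLinks("Package json implements [RFC 7159]; see [undefined].", defs))
}
```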
Links to Go API documentation
Documentation is also more useful with clear links to other documentation, whether it's one function linking to another, preferred version, or a top-level doc comment summarizing the overall API of the package, with links to the key types and functions. Today there is no way to do this. Names can be mentioned, of course, but users must find the docs on their own.
Following discussion on #45533, I suggest treating doc links like the links in the previous section, without target definitions. Specifically:
New Rule: Doc links are links of the form “[Name1]” or “[Name1.Name2]” to refer to exported identifiers in the current package, or “[pkg]”, “[pkg.Name1]”, or “[pkg.Name1.Name2]” to refer to identifiers in other packages.
In the second form, “pkg” can be either a full import path or the assumed package name of an existing import.
The assumed package name is either the identifier in a renamed import or else the name assumed by goimports. (Goimports inserts renamings when that assumption is not correct, so this rule should work for essentially all Go code.)
A “pkg” is only assumed to be a full import path if it starts with a domain name (a path element with a dot) or is one of the packages from the standard library (“[os]”, “[encoding/json]”, and so on).
To avoid problems with maps, generics, and array types, doc links must be both preceded and followed by punctuation, spaces, tabs, or the start or end of a line.
For example, if the current package imports encoding/json, then “[json.Decoder]” can be written in place of “[encoding/json.Decoder]” to link to the docs for encoding/json's Decoder.
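The package-resolution part of the rule could look roughly like the sketch below. Everything here is an assumption for illustration: the function resolvePkg, the imports map from assumed package names to import paths, and the std set standing in for a list of standard library packages.

```go
package main

import (
	"fmt"
	"strings"
)

// resolvePkg maps the "pkg" part of a doc link to a full import path.
// imports maps assumed package names of the current package's imports
// to their import paths; std lists standard library import paths.
// (Hypothetical helper for illustration; not the go/doc implementation.)
func resolvePkg(pkg string, imports map[string]string, std map[string]bool) (string, bool) {
	// An existing import's assumed (or renamed) package name.
	if path, ok := imports[pkg]; ok {
		return path, true
	}
	// A full import path: starts with a domain name (first path
	// element contains a dot) or is a standard library package.
	first := pkg
	if i := strings.Index(pkg, "/"); i >= 0 {
		first = pkg[:i]
	}
	if strings.Contains(first, ".") || std[pkg] {
		return pkg, true
	}
	return "", false
}

func main() {
	imports := map[string]string{"json": "encoding/json"}
	std := map[string]bool{"os": true, "encoding/json": true}

	fmt.Println(resolvePkg("json", imports, std))
	fmt.Println(resolvePkg("encoding/json", imports, std))
	fmt.Println(resolvePkg("example.com/mod/pkg", imports, std))
	fmt.Println(resolvePkg("notimported", imports, std))
}
```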
The implications and potential false positives of this implied URL link are presented by Joe Tsai here. In particular, the false positive rate appears to be low enough not to worry about.
To illustrate the need for the punctuation restriction, consider:
versus
and
Transition: Older versions of Go will show the text verbatim, which is still fairly readable.
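The punctuation restriction can be sketched as a boundary check around each bracketed name. The helper findDocLinks and the exact punctuation set are assumptions for this example; the proposal text only says "punctuation, spaces, tabs, or the start or end of a line".

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// A bracketed identifier, optionally with dots and slashes.
var bracketRE = regexp.MustCompile(`\[([A-Za-z_][A-Za-z0-9_./]*)\]`)

// isBoundary reports whether b may sit directly next to a doc link.
// The punctuation set here is an assumption made for illustration.
func isBoundary(b byte) bool {
	return b == ' ' || b == '\t' || strings.IndexByte(".,;:!?()\"'", b) >= 0
}

// findDocLinks returns the bracketed names in line that are preceded and
// followed by a boundary character or the start or end of the line.
func findDocLinks(line string) []string {
	var out []string
	for _, m := range bracketRE.FindAllStringSubmatchIndex(line, -1) {
		start, end := m[0], m[1]
		if start > 0 && !isBoundary(line[start-1]) {
			continue // e.g. the index expression in map[ast.Expr]T
		}
		if end < len(line) && !isBoundary(line[end]) {
			continue
		}
		out = append(out, line[m[2]:m[3]])
	}
	return out
}

func main() {
	fmt.Println(findDocLinks("See [json.Decoder] for details."))
	fmt.Println(findDocLinks("Types is a map[ast.Expr]TypeAndValue."))
}
```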
Formatter
Along with the changes, I suggest we add to go/doc a function
that reformats a doc comment in the conventional presentation, and then to have go/printer and gofmt invoke this formatter.
The formatter would canonicalize the input so that it formatted exactly as before but with the following properties:
defined to render as “ and ” are replaced with those.
The formatter would not reflow paragraphs, so as not to prohibit use of the semantic linefeeds convention.
This canonical formatting has the benefit for Markdown aficionados of being compatible with the Markdown equivalents. The output would still not be exactly Markdown, since various punctuation would not be (and does not need to be) escaped, but the block structure Go doc comments and Markdown have in common would be rendered as valid Markdown.
Additional API
The current doc.ToHTML is given only the comment text and therefore cannot implement import-based links to other identifiers. To address this, we would need to add ToHTML and ToText methods to the Package type, and define that the top-level functions are as though calling the methods on a zero value of the struct. The ToHTML method will need to take a new Config struct that, at the least, allows specifying the URL prefix of the documentation server (for example, / or https://pkg.go.dev/ or https://golang.org/pkg/). It would also need to specify the HTML tag for headings, which will depend on the surrounding page where the docs will be presented.
There is an accepted proposal to add doc.ToMarkdown for easy conversion of Go doc comments to Markdown, and we would implement and update that as part of this work. It too would be added to the Package type.