Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Linkcheck does not traverse into docstrings #2329

Closed
goerz opened this issue Nov 3, 2023 · 4 comments · Fixed by #2330
Closed

Linkcheck does not traverse into docstrings #2329

goerz opened this issue Nov 3, 2023 · 4 comments · Fixed by #2330

Comments

@goerz
Copy link
Member

goerz commented Nov 3, 2023

When makedocs is called with linkcheck=true, the Documenter.linkcheck routine does not traverse into DocsNode. That is, broken links inside of docstrings will not be reported.

The reason for this is

function linkcheck(node::MarkdownAST.Node, doc::Document; method::Symbol=:HEAD)
node.element isa MarkdownAST.Link || return

Since the node.element for a docstring is a DocsNode, nothing happens.

To fix this, linkcheck(node::MarkdownAST.Node, doc::Document; method::Symbol=:HEAD) should receive element=node.element as an explicit second parameter, so that we can dispatch on that. Other places in Documenter follow the same approach, e.g., the domify function in the HTMLWriter.

An alternative solution would be for AbstractTrees.PreOrderDFS to traverse into DocsNode, but that might be a lot trickier. Besides, DocumenterCitations has the exact same problem, and there either linkcheck would have to dispatch for elements of type BibliographyNode or AbstractTrees.PreOrderDFS would have to traverse into BibliographyNode.

I would maybe also refactor Documenter.linkcheck a little bit. Right now it is difficult to unit-test. The call to curl should be probably be isolated more into a linkcheck(url::String) method that returns (status, scheme, location): I don't think that network access on Github workflow runners (or anywhere else the test suite runs) is necessarily robust. For unit-testing, you'd want to mock the call to curl and use a pre-defined (status, scheme, location) for a given url.

We could easily add a cache keyword argument to linkcheck(doc) and linkcheck(node, doc; method=:HEAD) (which should be linkcheck(node, element, doc; method=:HEAD), as discussed above). This cache would a Dict mapping url::String => (status, scheme, location). It would be internal use only (specifically for unit testing, that is). If a url is in the cache (set up in a test), we'd just use the cached values instead of calling linkcheck(url) that goes out to curl and the network.

@goerz
Copy link
Member Author

goerz commented Nov 3, 2023

The first commit 47c31df in #2330 illustrates the issue

@goerz
Copy link
Member Author

goerz commented Nov 3, 2023

The call to curl should be probably be isolated

Or maybe, instead of all the refactoring, we can just put a "fake" curl executable in the PATH during the test that just gives predefined responses without acutally going to the network

@mortenpi
Copy link
Member

mortenpi commented Nov 6, 2023

Two notes about testing and refactoring here:

  1. We run the "online" tests as part of a separate workflow:

    name: "Linkcheck: online tests"
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
    - uses: julia-actions/setup-julia@latest
    with:
    version: '1'
    - name: Install dependencies
    run: julia --color=yes --project -e'using Pkg; Pkg.instantiate()'
    - name: Run online linkcheck tests
    run: julia --color=yes --project test/online_linkcheck.jl

    The online tests are in this file. The idea being that normal tests shouldn't try to access the network. Also helpful for getting Documenter to work on PkgEval etc.

  2. We should replace curl with Downloads.jl anyway (Use LibCurl.jl for curl binary when checking links #1601). Doesn't fully resolve the testing problem. I'd be happy with some Mocking.jl-type stuff. Or I think we could also run a local web server with HTTP.jl that returns the necessary responses, for more "realistic" unit testing.

@goerz
Copy link
Member Author

goerz commented Nov 6, 2023

Ah, I missed online_linkcheck.jl (and, that it runs separately). I agree, I should move it over there.

Replacing curl with Downloads.jl sounds like a good plan for the longer term, and maybe disentangling the network access from the main linkcheck function for easier testing would be something to do as part of that. I'll hold off on any refactoring for now, though.

For what it's worth, for the equivalent test in DocumenterCitations I tried out a mock curl executable that the test temporarily adds to the PATH to override the system curl. That turns out to work pretty well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants