-
Notifications
You must be signed in to change notification settings - Fork 3.7k
fix(bash): correctly highlight doctags in comments again (#4234) #4239
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
} | ||
} | ||
); | ||
const COMMENT = hljs.COMMENT(/(?<=^|\s)#/, /$/); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't use look-behind until v12 because it's a breaking change in older Safari.
How did we do this previously? I also think we do not want any spacing in front of the comment to be scoped as part of the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't use look-behind until v12 because it's a breaking change in older Safari.
Yeah, I thought as much. This is not a particularly pressing bug for me so I’d be fine waiting for whenever that happens, unless we can find another way to implement this first.
How did we do this previously?
Prior to d78749a it was just using hljs.HASH_COMMENT_MODE
directly, but that doesn’t handle the whitespace requirement.
I also think we do not want any spacing in front of the comment to be scoped as part of the comment.
Agreed; if it was OK to scope it as part of the comment we wouldn’t need the lookbehind, the regex could just be /(^|\s)#/
. There are also some existing test cases which cover this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was breaking if we just matched it as /#/
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #3918 and associated tests:
highlight.js/test/markup/bash/not-comments.expect.txt
Lines 1 to 3 in d21e285
<span class="hljs-built_in">echo</span> asdf#qwert yuiop | |
<span class="hljs-built_in">echo</span> asdf <span class="hljs-comment">#qwert yuiop</span> |
For example, echo foo#bar
will print foo#bar
; the #bar
part is not a comment.
Presumably the author of that change had a specific scenario in mind, but I went looking for some real-world examples in the Oil Shell test corpus, and instead found another example of something that it has broken the highlighting for:
branch_name=HEAD ;# detached
...which ought to be highlighted as a comment (since it's not inside a command), even though it doesn't have whitespace before it.
So, while that change was well-meaning, on closer inspection I think very careful reading of POSIX §2.3 would be needed to figure out what the actual requirements are here; at a glance I see at least |;<>&
; sometimes ()
can also be immediately followed by a comment, but (I think) only when indicating a standalone subshell; I haven’t looked at it closely but I think in the general case you may need a full-on stateful parser to determine exactly where in the grammar you are and what should be done with a #
character in that state.
Given that, this is maybe a question of determining which kinds of brokenness are most acceptable? I certainly don't figure like wrangling the grammar to make sure that every edge case is covered correctly...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Also cc'ing @iFreilicht as author of that PR in case they have any more insight into their use case)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wolfgang42 Thank you! The important part for me is that a command like nix run nixpkgs#btop
should be correctly highlighted, i.e. #btop
is not a comment in this case, and nixpkgs#btop
is a single token. POSIX-compatible shells require a comment to be at the beginning of a line or be preceeded by whitespace, or at least I thought that was the whole of it at the time.
Changes
Previously, because whitespace needed to be detected as part of the comment, but not highlighted that way, the match was split and only the second part of the match was assigned a scope. However, for reasons I couldn’t figure out, it seems that the presence of multiple
scope
values breaks thecontains
highlighting logic for a mode. (Or something to that effect.)Rather than resolving whatever’s causing that problem, this PR changes the regex to use a lookbehind assertion, thus requiring the whitespace to be there without capturing it as part of the match.
I think this is a breaking change: #3890 says that negative lookbehinds aren’t allowed in Highlight.js until v12 because of Safari support; it seems like the same is true of positive lookbehind, in which case merging this requires a major version bump.
Closes #4234.
Checklist
CHANGES.md
(Full disclosure: this patch was prepared with the assistance of an LLM. However I have prepared this PR description myself, and am confident that the changes are correct.)