PML 2.0: Attributes Challenges #30

tajmone · 2021-09-07T10:23:23Z

tajmone
Sep 7, 2021
Maintainer

@pml-lang and @celtic-coder, the PML 2.0.0 update has introduced significant breaking changes in the attributes syntax. Here's my ground plan on how to (re)implement attributes in Sublime PML.

I'm also going to explain why, as mentioned in Discussion #56, I believe that the new attributes system is going to be very hard to implement in TextMate-like editors (i.e. with RegEx-based syntax definitions) other than Sublime Text 4, and why, even if it can be implemented, those implementations won't offer smart editing features or proper plug-in support.

The New Attributes Syntax

PML attributes are now enclosed within parentheses ( ), unless it's a node that only supports attributes, in which case the parentheses can be omitted (new lenient parsing rules).

Examples:

[image ( source="images/juicy apple.png" width="400" ) ]
[image source="images/juicy apple.png" width="400" ]

As before, attributes must immediately follow the node opening tag, except that now they don't need to be on the same line and can even be defined on subsequent lines (the line continuation \ was apparently dropped, although this isn't mentioned in the Changelog):

[image
    (source="images/juicy apple.png"
     width="400"
    )]

So, basically as soon as a character other than whitespace (\s) or an opening ( is encountered, we're looking at the node contents (if contents begin with a (, you must add an empty attributes list via ()).
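As a rough illustration of the strict rule just described, here is a minimal Python sketch. The function name what_follows_tag and the "[node" example are hypothetical, and lenient (parenthesis-free) attributes are deliberately not handled; this is not how the PML parser itself is implemented.

```python
import re

# Hypothetical helper: after a node's opening tag (e.g. "[image"), skip
# whitespace (including newlines) and report what comes next under the
# strict rule. Lenient attributes without ( ) are NOT covered here.
_WS = re.compile(r"\s*")

def what_follows_tag(text: str, pos: int) -> str:
    """Return 'attributes', 'contents' or 'end' for the text at `pos`."""
    pos = _WS.match(text, pos).end()  # \s* always matches, possibly empty
    if pos >= len(text):
        return "end"
    return "attributes" if text[pos] == "(" else "contents"

multi_line = '[image\n    (source="images/juicy apple.png"\n     width="400"\n    )]'

print(what_follows_tag(multi_line, len("[image")))       # attributes
print(what_follows_tag("[ch [title Some Chapter]]", 3))  # contents
# "[node" is a made-up node name: contents starting with "(" need an
# empty attributes list "()" in front of them.
print(what_follows_tag("[node () (text)]", len("[node")))  # attributes
```

Because the whitespace skip crosses newlines, the multi-line form is classified the same way as the single-line one.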

New Attributes Scoping

Before, in Sublime PML we always scoped the "space" between an opening tag and the beginning of its contents as meta.annotation.node-attributes.<node name>.pml (where <node name> indicates the specific node type). This is what allowed for smart auto-completion suggestions based on awareness of the current node context and, most importantly, made it possible to suggest attribute completions only where they are allowed.

With the new syntax, we'll now need two attribute scopes, not just one:

meta.annotation.missing-attributes.<node name>.pml
meta.annotation.node-attributes.<node name>.pml

because we must now distinguish between nodes that could take attributes but don't have any defined and nodes that already have some attributes. This distinction determines whether the auto-completion should enclose the attribute within ( ) or not.

Example:

[ch [title Some Chapter]]
   ^ meta.annotation.missing-attributes.chapter.pml

[ch (html_class = big) [title Another Chapter]]
     ^^^^^^^^^^^^^^^^ meta.annotation.node-attributes.chapter.pml

In the first of the above examples, if I type id in the gap between [ch and [title, the auto-completion should be ( id = someId ); whereas in the second example, attribute auto-completions should only be suggested within the (...), and typing id therein should result in id = someId instead, without the parentheses.

So, you see where I'm getting at: the new syntax demands more complex scoping when it comes to smart editing, since the syntax needs to be aware of the different attribute contexts and be able to adapt to editing changes in real time.
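To make the two completion behaviours concrete, here is a small Python sketch of how a completion could be chosen from the scope at the caret. The function name attribute_completion, the someId placeholder and the text.pml base scope are assumptions for illustration; in a Sublime Text plug-in the scope string could be obtained with view.scope_name(point), which returns the scopes as a space-separated string.

```python
# Prefixes of the two meta scopes proposed above; everything else in this
# sketch (function name, placeholder value, base scope) is hypothetical.
MISSING = "meta.annotation.missing-attributes."
PRESENT = "meta.annotation.node-attributes."

def attribute_completion(scope_at_caret, name, placeholder="someId"):
    """Return the text to insert for attribute `name`, or None when the
    caret is somewhere attributes are not allowed."""
    scopes = scope_at_caret.split()
    if any(s.startswith(MISSING) for s in scopes):
        # No attribute list yet: the completion has to create the ( ) too.
        return f"( {name} = {placeholder} )"
    if any(s.startswith(PRESENT) for s in scopes):
        # Caret is already inside ( ... ): insert the bare attribute only.
        return f"{name} = {placeholder}"
    return None

gap = "text.pml meta.annotation.missing-attributes.chapter.pml"
inside = "text.pml meta.annotation.node-attributes.chapter.pml"

print(attribute_completion(gap, "id"))         # ( id = someId )
print(attribute_completion(inside, "id"))      # id = someId
print(attribute_completion("text.pml", "id"))  # None
```

The point of the sketch is only that the plug-in needs both scopes to exist: without the missing-attributes scope it cannot tell the first case from plain node contents.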

New Parsing Strategy

Hence, the new attributes context needs to leverage the new ST4 branching features.

After matching an opening tag that supports attributes, we'll first have to check for the presence of a default attribute without ( ) (if the node supports it), in which case I guess no further attributes are allowed. If there isn't a lenient default attribute, we need to set a branching point and start scanning for either attributes or node contents; depending on the result, we'll then set the appropriate meta scope for attribute auto-completions.

We need a branching point because the syntax parser needs to rewind to it when our attempt to find attributes (either within or without parentheses) fails. This can no longer be handled via lookahead RegExs, because attributes are no longer bound to occur immediately after the opening tag on the same line: they can occur on subsequent lines, and TextMate-like grammars can only parse one line at a time. So if the editor's syntax engine doesn't support branching points to rewind a failed parse, this type of smart-editing awareness can't be implemented.
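The limitation can be emulated in a few lines of Python. This is only a sketch under stated assumptions: line_by_line_has_attrs mimics a single-line lookahead rule, and scan_with_rewind mimics the idea of a branching point by remembering a position and falling back to it. Both function names are hypothetical, and neither reflects how ST4's syntax engine works internally.

```python
import re

def line_by_line_has_attrs(text, tag):
    # Single-line lookahead: the tag must be followed, on the SAME line,
    # by optional blanks and "(". The parser never sees the next line.
    rule = re.compile(re.escape("[" + tag) + r"(?=[ \t]*\()")
    return any(rule.search(line) for line in text.splitlines())

def scan_with_rewind(text, tag):
    # Branching emulation: remember where the tag ends, scan ahead across
    # lines, and rewind to that point if no "(" turns up.
    opening = re.search(re.escape("[" + tag), text)
    if opening is None:
        return None
    branch_point = opening.end()
    pos = branch_point
    while pos < len(text) and text[pos].isspace():
        pos += 1
    if pos < len(text) and text[pos] == "(":
        return ("attributes", pos)
    return ("contents", branch_point)  # attempt failed: rewind

same_line = '[image ( source="images/juicy apple.png" width="400" ) ]'
multi_line = '[image\n    (source="images/juicy apple.png"\n     width="400"\n    )]'

print(line_by_line_has_attrs(same_line, "image"))   # True
print(line_by_line_has_attrs(multi_line, "image"))  # False
print(scan_with_rewind(multi_line, "image"))        # ('attributes', 11)
print(scan_with_rewind("[ch [title Some Chapter]]", "ch"))  # ('contents', 3)
```

The second call shows the problem: the multi-line attributes are valid PML 2.0, yet a rule confined to one line reports no attributes at all.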

New Editors Challenges

Besides LSP language servers and dedicated editors that implement syntax highlighting via custom code modules (e.g. Scintilla-based editors), the majority of general-purpose editors today tend to use TextMate-like grammars, which define new syntaxes via RegEx-based context definitions.

To the best of my knowledge, only ST4 supports syntax branching and parse rewinding, so I think that supporting PML 2.x in editors like VSCode, Atom, TextMate, etc., is going to be rather challenging.

Not only is it quite possible that attributes can't be parsed precisely (due to the new multi-line possibility), especially in their new lenient variants, but even if we manage to do so (via some horrible hacks), we still won't be able to provide those extra meta scopes that are so vital for smart editing features like context-aware auto-suggestions.

Although many seem to think that syntax highlighting is mostly about colouring syntax elements and constructs in a consistent and meaningful way, there's much more to a syntax definition than that: semantic scoping (especially the meta scopes, which are not used by colour themes) makes it possible to create smart editing features and plug-ins that leverage context awareness to enhance the editing experience and add cool features (e.g. linting, refactoring, etc.).

pml-lang · 2021-09-08T07:06:18Z

pml-lang
Sep 8, 2021
Collaborator

See my previous comment.

Parsing was more difficult before version 2. However, the lenient rules were not documented (really sorry for that), therefore it might seem that the rules are more complex now. But they have actually been simplified.

I can only re-iterate my suggestion to start with a plugin that does not support lenient parsing (and clearly state that in the plugin's documentation), because regex-based parsing is just not powerful enough to cover all lenient parsing cases. Maybe more can be achieved with ST4, but it will probably still be really hard (or maybe not possible) to cover all cases, and keep it working in future PML versions.

Smart editing features that work reliably in all current and future PML documents can probably only be achieved with a language server that uses the PML/pXML parser.

In a previous comment you said:

"If the parse is now in Java, there should be various solutions to auto-generate a language server then, but it might require defining the PML lang via some BNF like grammar I guess."

Yes, that would be a good approach to investigate. Tools like Xtext or textX (which you mentioned already in another comment) might be the best solution to create a PML language server in a minimum amount of time.

However, I also think that simple plugins for popular editors (limited to strict pXML syntax) would still be very useful.


tajmone · 2021-12-26T19:13:47Z

tajmone
Dec 26, 2021
Maintainer Author

PML Syntax Guide

FYI, I've started working on a new document project: the PML Syntax Guide:

https://github.com/tajmone/pml-playground/tree/main/syntax-guide

The latest version of the HTML document can always be accessed via HTML Live Preview:

PML-Syntax-Guide.html


pml-lang · 2021-12-27T06:59:45Z

pml-lang
Dec 27, 2021
Collaborator

Great!

Just a question: Did you consider writing the guide in PML? Or is AsciiDoc a better choice because it is natively supported on GitHub?

2 replies

tajmone Dec 27, 2021
Maintainer Author

"Just a question: Did you consider writing the guide in PML? Or is AsciiDoc a better choice because it is natively supported on GitHub?"

I'm writing it in AsciiDoc because I'll need to syntax highlight PML code using my custom PML syntax for Rouge, which is not supported by PML, and also because I'll need to do some include:: magic from the JSON Tags mustache templates, like CSV tables, etc., which is very practical in Asciidoctor.

pml-lang Dec 27, 2021
Collaborator

Thanks for the answer. That makes sense.
