Added spans to AST nodes #373

Open: wants to merge 2 commits into main

@Ertanic Ertanic commented Nov 16, 2024

I needed a parser for .ftl files. I found tree-sitter-fluent, but for some reason it couldn't parse a valid file and threw errors on replaceable expressions. So I decided to use fluent-syntax, but the JavaScript version has node spans while the Rust version does not. This PR adds them. Since spans are usually not needed, I put them behind the `spans` feature.

Also, some tests reformat the tree, which changes the spans. To avoid conflicts there, the PartialEq implementation for AST nodes is split into a derived implementation and a manual one.

It would be a good idea to write tests that check the spans, but I'm not sure how best to do that. I need help with this.
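A minimal sketch of a manual `PartialEq` that skips the span, assuming a hypothetical `Identifier` node and `Span` type (the actual names and fields in this PR may differ):

```rust
/// Hypothetical byte-offset span; not necessarily the type used in the PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

/// Hypothetical AST node that carries its source span.
#[derive(Debug, Clone)]
pub struct Identifier {
    pub name: String,
    pub span: Span,
}

// Manual impl: compare the content only and deliberately skip `span`,
// so a reformatted tree still compares equal to the original one.
impl PartialEq for Identifier {
    fn eq(&self, other: &Self) -> bool {
        self.name == other.name
    }
}
```

With this impl, two `Identifier` values with the same `name` but different spans compare equal, while their `span` fields still compare unequal on their own.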

@alerque
Collaborator

alerque commented Nov 16, 2024

This would address #270, no?

Have you run any benchmarks with/without this feature enabled?

@Ertanic
Author

Ertanic commented Nov 16, 2024

> This would address #270, no?

I only looked through the open PRs.

> Have you run any benchmarks with/without this feature enabled?

| bench | default | `spans` feature |
| --- | --- | --- |
| construct/preferences | 26.619 µs | 24.673 µs |
| resolve/preferences | 15.104 µs | 14.990 µs |
| resolve_to_str/preferences | 22.803 µs | 22.902 µs |
| parse_ctx_runtime/preferences | 300.13 µs | 305.19 µs |
| parse_ctx_runtime/browser | 120.29 µs | 145.79 µs |
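To make the numbers easier to compare, here is a small sketch that computes the relative change per benchmark. The helper names `relative_change_percent` and `report` are made up for illustration; the timings are copied from the results above.

```rust
/// Relative change of the `spans` build versus the default build, in percent.
/// Positive values mean the `spans` build was slower.
fn relative_change_percent(default_us: f64, with_spans_us: f64) -> f64 {
    (with_spans_us - default_us) / default_us * 100.0
}

/// (benchmark name, default µs, `spans` µs), copied from the results above.
const RESULTS: [(&str, f64, f64); 5] = [
    ("construct/preferences", 26.619, 24.673),
    ("resolve/preferences", 15.104, 14.990),
    ("resolve_to_str/preferences", 22.803, 22.902),
    ("parse_ctx_runtime/preferences", 300.13, 305.19),
    ("parse_ctx_runtime/browser", 120.29, 145.79),
];

/// One formatted line per benchmark, e.g. "name: +1.7%".
fn report() -> Vec<String> {
    RESULTS
        .iter()
        .map(|(name, default_us, spans_us)| {
            format!("{name}: {:+.1}%", relative_change_percent(*default_us, *spans_us))
        })
        .collect()
}
```

On these numbers the largest difference is `parse_ctx_runtime/browser`, roughly 21% slower with `spans` enabled; the other rows differ by less than 8% in either direction.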

Collaborator

@zbraniecki zbraniecki left a comment


All in all, happy to see this! I was definitely hoping we'd gain this feature one day. Thank you for contributing!

My one overarching concern is that you use PartialEq to treat elements with different spans as equal. I'm not sure how canonical it is to use PartialEq this way. We wrangled with the meaning and role of PartialEq in ICU4X for a long time and I'm still not sure how to handle it, but in ICU4X we decided to stay on the cautious side and introduce a function like cmp_value to allow for comparisons that exclude part of the value.
A span can be thought of either as metadata or as part of the element, and implementing the trait this way makes it impossible to compare nodes including their spans.

I'm not sure what the right way around it is. If ASTs in Rust (or other languages) canonically do this, and comparing while skipping spans is not uncommon, I'm fine with doing the same.

@Ertanic
Author

Ertanic commented Nov 17, 2024

I would rather not use a manual implementation of PartialEq, but it was the best solution I found. I came across several alternatives online, including the derivative crate, which lets you ignore a particular field when using #[derive(Derivative)] together with #[derivative(PartialEq)], but I didn't want to pull in an additional dependency just for that. It would be much more convenient, though: when a struct's fields change, there is no manual PartialEq implementation to keep in sync.

Span itself implements PartialEq so that it can be compared to others. I was looking at tree-sitter, where the range of a node is provided through the corresponding function.

And I don't quite understand your point. Are you proposing to introduce additional methods for fields to compare structures and their fields? Or to compare all fields of node structures separately from PartialEq and Eq in separate methods?

Again, this is all for the sake of passing some tests that take one .ftl file as input, serialize it into formatted FTL, and parse it again, so the nodes end up with different spans. The options are: change the input data of the tests, which I think is wrong; extend the serializer so that it builds the FTL content from the spans; or separate the comparison implementation. I chose the easiest option because I needed this urgently for my LSP server, and I haven't had any problems with it so far.
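The round-trip problem can be illustrated with a toy example. The sketch below is not fluent-syntax: `Word`, `parse_words`, and `serialize` are hypothetical stand-ins for an AST node, a parser that records byte spans, and a serializer that normalizes whitespace.

```rust
/// Hypothetical node: a word plus the byte range it occupied in the source.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Word {
    pub text: String,
    pub span: (usize, usize), // [start, end) byte offsets
}

/// Toy "parser": splits on whitespace and records each word's span.
pub fn parse_words(source: &str) -> Vec<Word> {
    let mut words = Vec::new();
    let mut start: Option<usize> = None;
    for (i, ch) in source.char_indices() {
        match (ch.is_whitespace(), start) {
            // First character of a new word.
            (false, None) => start = Some(i),
            // Whitespace ends the current word.
            (true, Some(s)) => {
                words.push(Word { text: source[s..i].to_string(), span: (s, i) });
                start = None;
            }
            _ => {}
        }
    }
    // Flush a word that runs to the end of the input.
    if let Some(s) = start {
        words.push(Word { text: source[s..].to_string(), span: (s, source.len()) });
    }
    words
}

/// Toy "serializer": joins words with single spaces, discarding the original layout.
pub fn serialize(words: &[Word]) -> String {
    words.iter().map(|w| w.text.as_str()).collect::<Vec<_>>().join(" ")
}
```

Parsing `"key   =  value"`, serializing it, and parsing the result again yields the same three words, but the derived `PartialEq` reports the two trees as different because the reformatted source shifts every span after the first word.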

@zbraniecki
Collaborator

zbraniecki commented Nov 18, 2024

> And I don't quite understand your point. Are you proposing to introduce additional methods for fields to compare structures and their fields? Or to compare all fields of node structures separately from PartialEq and Eq in separate methods?

I'm raising a concern that semantically the following code should pass:

let node1 = Node {
  value: "foo",
  span: span!(0, 4),
};

let node2 = Node {
  value: "foo",
  span: span!(5, 11),
};
assert_ne!(node1, node2);

because those two nodes are not equal. Their content is different.

Now, what is true is that in most cases we care about the actual content of the node, not its meta information. We can explicitly achieve that by doing:

let node1 = Node {
  value: "foo",
  span: span!(0, 4),
};

let node2 = Node {
  value: "foo",
  span: span!(5, 11),
};
assert_ne!(node1, node2);

// Option 1: compare the content fields directly
assert_eq!(node1.value, node2.value);

// Option 2: a dedicated content-only comparison method
assert!(node1.cmp_content(&node2));

Or we can do what is proposed in the PR and add:

let node1 = Node {
  value: "foo",
  span: span!(0, 4),
};

let node2 = Node {
  value: "foo",
  span: span!(5, 11),
};
assert_eq!(node1, node2);

assert_ne!(node1.span, node2.span);
// Assuming cmp_span is a comparison that also takes spans into account
assert!(!node1.cmp_span(&node2));

I'm not sure what is the most common approach to AST comparisons with spans. I'd suggest checking prior art in other parser/AST/serializer models.
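For reference, Option 2 could look roughly like the sketch below. The `Node` and `Span` types are the hypothetical ones from the snippets above, not the real fluent-syntax AST, and `cmp_content` is only an illustrative name.

```rust
/// Hypothetical span type for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

/// Hypothetical node. `PartialEq` is derived, so `==` includes `span`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Node {
    pub value: &'static str,
    pub span: Span,
}

impl Node {
    /// Content-only comparison that ignores `span`.
    pub fn cmp_content(&self, other: &Self) -> bool {
        self.value == other.value
    }
}
```

Here `node1 == node2` stays strict and fails when only the spans differ, while `node1.cmp_content(&node2)` returns `true`. The cost of this design is that every node type needs its own content comparison, which has to be kept in sync with the struct's fields.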
