
Serde support for ParseGenericTree #589

Open
@ratmice


Currently, the parse_generictree function returns a Node, and Node::pp can then convert that parse tree into the textual representation used by nimbleparse.

One thing I'm slightly interested in is extending the output format to be more machine-readable.
One thought is that we could add a function similar to Node::pp that also takes a serde::Serializer,
and that, instead of returning a String, returns something from the serde data model.

Here are a couple questions:

  1. This can likely be done outside of the repo, in the same way that we can copy/paste the pp function implementation and build it outside the crate as part of a binary to customize the output.
  2. It isn't clear to me what exactly the best translation into the serde data model would be.
  3. type signature/type of the return value?
  4. Reusing Node or not?
  5. We likely would need to choose some specific Serializer impl for e.g. nimbleparse/test_files
  6. Then does it make sense to

In repo or out?

One reason to include this in the repo is that it could then be used by the test_files, which would make it possible to introduce a known file extension for files that contain both the parser input and a serialized form of the parser output.

That seems to me like the most direct argument for its inclusion, but using it in tools like nimbleparse would likely also mean deciding on a concrete Serializer type. So perhaps it is worth experimenting with this outside the repo before making these kinds of decisions.

AST translation into serde data model

One way would be to just use serialize_map with String keys, recursively using serialize_map for the values. This seems possible, because serialize_map allows duplicate keys. However, I fear that only a subset of serde Serializer implementations will accept duplicate keys.

So another option would be to use something more S-expression-like, using serialize_seq combined with serialize_tuple.

Regarding the first question: in theory we could still choose the map approach even if it only works with a subset of Serializer implementations, provided it is the most natural fit for, e.g., the output format chosen for nimbleparse. The tuple/seq approach would remain possible outside the repo.
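
To make the two shapes concrete, here is a minimal sketch using only the standard library. Everything in it is hypothetical: Tree is a simplified stand-in for a parse tree (not lrpar's Node), and Value is a tiny stand-in for the relevant part of the serde data model (not a serde type). It only shows why the map shape produces duplicate keys for a rule such as Expr: INT PLUS INT, while the seq/tuple shape does not.

```rust
use std::collections::BTreeSet;

// Hypothetical, simplified parse tree; NOT lrpar's `Node`.
#[derive(Debug, Clone, PartialEq)]
enum Tree {
    // A lexeme: token name plus the text it matched.
    Term { name: String, text: String },
    // A rule: rule name plus its children, in order.
    Nonterm { name: String, children: Vec<Tree> },
}

// Hypothetical stand-in for a slice of the serde data model.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Str(String),
    Seq(Vec<Value>),
    // A map kept as an ordered list of entries, so duplicates are visible.
    Map(Vec<(String, Value)>),
}

// Map shape: every node becomes a (name, value) entry; a rule's value
// is a map holding one entry per child.
fn to_map_entry(t: &Tree) -> (String, Value) {
    match t {
        Tree::Term { name, text } => (name.clone(), Value::Str(text.clone())),
        Tree::Nonterm { name, children } => (
            name.clone(),
            Value::Map(children.iter().map(to_map_entry).collect()),
        ),
    }
}

// S-expression shape: every node becomes a 2-element sequence
// [name, payload]; a rule's payload is a sequence of its children.
fn to_sexp(t: &Tree) -> Value {
    match t {
        Tree::Term { name, text } => {
            Value::Seq(vec![Value::Str(name.clone()), Value::Str(text.clone())])
        }
        Tree::Nonterm { name, children } => Value::Seq(vec![
            Value::Str(name.clone()),
            Value::Seq(children.iter().map(to_sexp).collect()),
        ]),
    }
}

// Keys that occur more than once at the top level of a map value.
// Returns an empty list for non-map values.
fn duplicate_keys(v: &Value) -> Vec<String> {
    let mut seen = BTreeSet::new();
    let mut dups = BTreeSet::new();
    if let Value::Map(entries) = v {
        for (k, _) in entries {
            if !seen.insert(k.as_str()) {
                dups.insert(k.clone());
            }
        }
    }
    dups.into_iter().collect()
}

// Example tree for the input "1+2" under a rule `Expr: INT PLUS INT`.
fn sample() -> Tree {
    let term = |n: &str, s: &str| Tree::Term {
        name: n.to_string(),
        text: s.to_string(),
    };
    Tree::Nonterm {
        name: "Expr".to_string(),
        children: vec![term("INT", "1"), term("PLUS", "+"), term("INT", "2")],
    }
}
```

Calling to_map_entry(&sample()) yields an "Expr" entry whose inner map contains the key "INT" twice, which is exactly the case a format with unique-key maps would reject or silently collapse. to_sexp(&sample()) keeps both INT children as separate sequence elements.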

Type signature

It's been a while since I've worked with the Serializer trait directly. Presumably we need to, because
the signature of pp doesn't stand alone in a way that lets us simply add Serialize bounds:
it requires the source string in order to turn indices into names at various points.
This would seem to leave us with something like the following signature.

pub fn serde_serialize<S: serde::Serializer>(&self, grm: &YaccGrammar<StorageT>, input: &str, serializer: S) -> ???

But it isn't entirely clear to me what it should return, as S::Ok doesn't have any trait bounds like Write etc.
Looking specifically at serde_json::Serializer::into_inner leads me to believe that this should return S.

Reusing Node

Due to the complexity of the third question (it involves Serializer, and there is no good Self type for a Serialize impl),
perhaps it'd be better to just try to implement Serialize on a newtype around Node:

#[derive(Serialize)]
struct SerializableNode<'a, LexemeT, StorageT>(Node<LexemeT, StorageT>, &'a str);

Or have a parse_serde which returns a SerializableNode?

Choice of format for nimbleparse/test_files

I personally think the ron format is likely to be one of the nicest formats for test_files,
because it has Rust's r#"raw string"# syntax, which makes it easy to embed arbitrary input text alongside the serialized AST.
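
As a small illustration of why raw strings help here, the sketch below wraps arbitrary input text in a Rust-style raw string literal, choosing just enough # characters that the text cannot terminate the literal early. The function name raw_string is hypothetical, it uses only the standard library, and it assumes the target format accepts the same r#"..."# syntax as Rust.

```rust
// Hypothetical helper: wrap `text` in a Rust-style raw string literal.
// A raw string opened with n hashes ends at the first `"` followed by
// n hashes, so n must exceed the longest run of `#` that directly
// follows any `"` inside the text.
fn raw_string(text: &str) -> String {
    let mut needed = 0usize;
    let mut chars = text.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '"' {
            // Count the hashes immediately after this quote.
            let mut run = 0usize;
            while chars.peek() == Some(&'#') {
                chars.next();
                run += 1;
            }
            needed = needed.max(run + 1);
        }
    }
    let hashes = "#".repeat(needed);
    format!("r{0}\"{1}\"{0}", hashes, text)
}
```

For input without any double quote the result needs no hashes, e.g. raw_string("1 + 2") gives r"1 + 2". Input containing a quote gets one hash, and input containing "# gets two. No escaping of the input text is performed, which is the point: the parser input appears verbatim in the test file.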

Just use format directly and skip serde?

Given all these issues, and the need to bless one serialization format implementation within binaries like nimbleparse, perhaps
it is worth skipping the serde data model and just targeting the chosen file format directly, with something like pp_ron(&self, &str) -> String.

In many ways it seems like we'd be introducing the complexity of the generic serde data model while, because the binaries have to commit to one format anyway,
not actually benefiting from it.
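
A minimal sketch of what such a direct approach could look like, using only the standard library. The Tree type is a hypothetical simplification (not lrpar's Node, which stores indices and needs the grammar and input to recover names), the Term/Nonterm output layout is made up for illustration, and strings are escaped with Rust's Debug formatting, which only approximates ron's escaping rules.

```rust
// Hypothetical, simplified parse tree; NOT lrpar's `Node`.
enum Tree {
    // A lexeme: token name plus the text it matched.
    Term { name: String, text: String },
    // A rule: rule name plus its children, in order.
    Nonterm { name: String, children: Vec<Tree> },
}

// Pretty-print a tree as RON-like text, without going through serde.
fn pp_ron(t: &Tree) -> String {
    let mut out = String::new();
    write_node(t, 0, &mut out);
    out
}

// Write one node at the given indentation depth, with no trailing
// comma or newline; the caller decides what follows.
fn write_node(t: &Tree, depth: usize, out: &mut String) {
    let pad = "  ".repeat(depth);
    match t {
        Tree::Term { name, text } => {
            out.push_str(&format!("{}Term({:?}, {:?})", pad, name, text));
        }
        Tree::Nonterm { name, children } => {
            out.push_str(&format!("{}Nonterm({:?}, [\n", pad, name));
            for child in children {
                write_node(child, depth + 1, out);
                out.push_str(",\n");
            }
            out.push_str(&format!("{}])", pad));
        }
    }
}

// Example tree for the input "1+2" under a rule `Expr: INT PLUS INT`.
fn sample() -> Tree {
    let term = |n: &str, s: &str| Tree::Term {
        name: n.to_string(),
        text: s.to_string(),
    };
    Tree::Nonterm {
        name: "Expr".to_string(),
        children: vec![term("INT", "1"), term("PLUS", "+"), term("INT", "2")],
    }
}
```

The trade-off is the one described above: this is short and has no serde dependency, but it hard-codes one output format, so any other format would need its own printer.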

Final thoughts

This turned out much longer, and probably requires more thought and effort, than I initially imagined.
But I really don't feel like I've tinkered with it enough (or at all, really) to have formed any real opinions, informed or strong.
