Description
Currently the `parse_generictree` function can be used to return a `Node`; from there, `Node::pp` can convert a parse tree into a textual representation as used by `nimbleparse`.
One thing I'm slightly interested in is extending the output format to be more machine readable.
As such, one thought is that we could include a function similar to `Node::pp` that also takes a `serde::Serializer`, but instead of returning a `String` returns something from the serde data model.
Here are a couple of questions:
- This can likely be done outside of the repo, in the same way that we can copy/paste the `pp` function implementation and build it outside the crate as part of a binary to customize the output.
- It isn't clear to me what exactly the best translation into the serde data model would be.
- Type signature/type of the return value?
- Reusing `Node` or not?
- We likely would need to choose some specific `Serializer` impl for e.g. `nimbleparse`/`test_files`
- Then does it make sense to
In repo or out?
One reason to include this in the repo is that it could then be used by the `test_files`, which would make it possible to introduce a known file extension which contains both parser input and a serialized form of parser output.
That seems to me like the most direct argument for its inclusion, but usage in tools like `nimbleparse` would also likely mean deciding on a concrete `Serializer` type too. So perhaps it is worth experimenting with this outside the repo before making these kinds of decisions.
AST translation into serde data model
One way would be to just use `serialize_map` with `String` keys, and recursively use `serialize_map` for the values. This seems possible, because `serialize_map` allows duplicate keys. However, I fear that only a subset of serde serializer implementations will allow these duplicate keys.
So another option would be to use something more S-expression-like, using `serialize_seq` combined with `serialize_tuple`.
In theory, regarding the first question, we could still choose the map approach even if it only works for a subset of `Serializer` implementations, if it is the most natural fit for e.g. the chosen output format of `nimbleparse`, with the idea that it is still possible to use the tuple/seq approach outside the repo.
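To make the difference between the two shapes concrete, here is a minimal std-only sketch. It does not use serde or lrpar: the `Tree` type, `children_as_map`, `to_sexpr` and `example_tree` are all hypothetical stand-ins, and `BTreeMap` stands in for any format or consumer that treats map keys as unique. It shows that keying children by name loses a child as soon as a rule has two children with the same name, while the sequence form keeps every child in order.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a parse tree (hypothetical; this is not lrpar's `Node`).
#[derive(Debug, Clone, PartialEq)]
enum Tree {
    Term { name: String, text: String },
    Nonterm { name: String, children: Vec<Tree> },
}

impl Tree {
    fn name(&self) -> &str {
        match self {
            Tree::Term { name, .. } | Tree::Nonterm { name, .. } => name,
        }
    }
}

/// Map-style encoding: children keyed by rule/token name. A map with unique
/// keys keeps only one child per repeated name.
fn children_as_map(children: &[Tree]) -> BTreeMap<&str, &Tree> {
    children.iter().map(|c| (c.name(), c)).collect()
}

/// Sequence-style encoding: each node is `(name payload...)` and children
/// stay in an ordered list, so repeated names are not a problem.
fn to_sexpr(tree: &Tree) -> String {
    match tree {
        Tree::Term { name, text } => format!("({} {:?})", name, text),
        Tree::Nonterm { name, children } => {
            let mut out = format!("({}", name);
            for child in children {
                out.push(' ');
                out.push_str(&to_sexpr(child));
            }
            out.push(')');
            out
        }
    }
}

fn term(name: &str, text: &str) -> Tree {
    Tree::Term { name: name.to_string(), text: text.to_string() }
}

/// A hand-built tree for the input `1+2` under a toy rule `Expr: Expr PLUS Expr | INT`.
fn example_tree() -> Tree {
    Tree::Nonterm {
        name: "Expr".to_string(),
        children: vec![
            Tree::Nonterm { name: "Expr".to_string(), children: vec![term("INT", "1")] },
            term("PLUS", "+"),
            Tree::Nonterm { name: "Expr".to_string(), children: vec![term("INT", "2")] },
        ],
    }
}
```

With this tree, the top-level `Expr` has three children but the map form only has two entries, which is the kind of loss I'd expect from any consumer that deduplicates keys.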
Type signature
It's been a while since I've worked with the `Serializer` trait directly. Presumably we need to, because the signature of `pp` doesn't stand alone in a way that we can throw `Serialize` bounds at it, but requires the source string to turn indices into names at various points.
This would seem to leave us with something like the following signature.
pub fn serde_serialize<S: serde::Serializer>(&self, grm: &YaccGrammar<StorageT>, input: &str, serializer: S) -> ???
But it isn't entirely clear to me what it should return, as `S::Ok` doesn't have any trait bounds like `Write`, etc. Looking specifically at `serde_json::Serializer::into_inner` leads me to believe that this should return `S`.
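For comparison, serde's own `Serialize::serialize` returns `Result<S::Ok, S::Error>`, and serde_json implements `Serializer` for `&mut serde_json::Serializer<W, F>`, so the caller keeps ownership of the serializer and can call `into_inner` afterwards. The sketch below imitates that shape with std only. `MiniSerializer`, `TextSerializer` and `serialize_lexeme` are hypothetical names made up for illustration, not serde or lrpar APIs.

```rust
use std::fmt::Write;

/// Hypothetical miniature of the `serde::Serializer` shape: it consumes
/// `self` and returns `Result<Self::Ok, Self::Error>`.
trait MiniSerializer: Sized {
    type Ok;
    type Error;
    fn serialize_str(self, v: &str) -> Result<Self::Ok, Self::Error>;
}

/// A serializer that owns its output buffer (stand-in for a writer-backed one).
struct TextSerializer {
    out: String,
}

impl TextSerializer {
    fn new() -> Self {
        TextSerializer { out: String::new() }
    }

    /// Hands the buffer back once serialization is finished.
    fn into_inner(self) -> String {
        self.out
    }
}

/// Implemented for `&mut TextSerializer`, so passing the serializer "by value"
/// only moves the mutable borrow; the caller still owns the serializer.
impl<'a> MiniSerializer for &'a mut TextSerializer {
    type Ok = ();
    type Error = std::fmt::Error;

    fn serialize_str(self, v: &str) -> Result<(), std::fmt::Error> {
        // Writes the string as a quoted, escaped literal.
        write!(self.out, "{:?}", v)
    }
}

/// Same shape as the signature above: generic over the serializer, takes the
/// source text to resolve a span, and returns `Result<S::Ok, S::Error>`.
fn serialize_lexeme<S: MiniSerializer>(
    input: &str,
    start: usize,
    len: usize,
    serializer: S,
) -> Result<S::Ok, S::Error> {
    serializer.serialize_str(&input[start..start + len])
}
```

Under that assumption, a call looks like `serialize_lexeme("1+2", 0, 1, &mut ser)` followed by `ser.into_inner()`, so the function itself would not need to return `S`.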
Reusing Node
Due to the complexity of the third question, including `Serializer` and not having a good `Self` type for a `Serialize` impl, perhaps it'd be better to just try and implement `Serialize` on a newtype around `Node`,
#[derive(Serialize)]
struct SerializableNode<'a>(Node<LexemeT, StorageT>, &'a str)
Or have a `parse_serde` which returns a `SerializableNode`?
Choice of format for nimbleparse/test_files
I personally think the RON format is likely to be one of the nicest formats for `test_files`, because it has Rust's `r#"raw string"#` syntax, which makes it easy to embed arbitrary input text along with the serialized AST.
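As a rough illustration of why raw strings help here, this std-only sketch wraps arbitrary parser input in a Rust-style raw string literal. `raw_string` is a hypothetical helper, and I'm assuming the target format accepts the same `r#"..."#` rules as Rust, with any number of `#`s.

```rust
/// Wraps `text` in a Rust-style raw string literal. It uses one more `#` than
/// the longest run of `#` in `text`, so no `"#...` sequence inside the text
/// can close the literal early. This is conservative: fewer hashes are often
/// enough, but this count is always safe.
fn raw_string(text: &str) -> String {
    let mut longest = 0usize;
    let mut run = 0usize;
    for ch in text.chars() {
        if ch == '#' {
            run += 1;
            longest = longest.max(run);
        } else {
            run = 0;
        }
    }
    let hashes = "#".repeat(longest + 1);
    format!("r{h}\"{t}\"{h}", h = hashes, t = text)
}
```

The input text is copied through unchanged, quotes and backslashes included, which is what makes this attractive for test files that hold parser input verbatim.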
Just use format directly and skip serde?
Given all the issues and the need to bless a serialization format impl within binaries like `nimbleparse`, perhaps it is worth skipping the serde data model and just using the chosen file format directly, such as `pp_ron(&self, &str) -> String`.
In many ways it seems like we're introducing the complexity of the generic serde data model, but due to the usage of binaries we might not benefit from it.
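A format-specific printer along these lines could be quite small. The following is only a sketch on a toy tree: `Tree`, `example_tree` and the free function `pp_ron` are hypothetical, the real `Node` stores lexemes and rule indices rather than names, and the output is RON-like rather than checked against a RON parser.

```rust
/// Toy parse tree (hypothetical; not lrpar's `Node`).
enum Tree {
    /// A token: its name plus the byte span it covers in the input.
    Term { name: &'static str, start: usize, len: usize },
    /// A rule: its name plus child nodes in source order.
    Nonterm { name: &'static str, children: Vec<Tree> },
}

/// Emits RON-like text directly, with no serde involved. Like `pp`, it needs
/// the source string to turn spans back into text.
fn pp_ron(tree: &Tree, input: &str) -> String {
    match tree {
        Tree::Term { name, start, len } => {
            let text = &input[*start..*start + *len];
            format!("Term({:?}, {:?})", name, text)
        }
        Tree::Nonterm { name, children } => {
            let kids: Vec<String> = children.iter().map(|c| pp_ron(c, input)).collect();
            format!("Nonterm({:?}, [{}])", name, kids.join(", "))
        }
    }
}

/// A hand-built flat tree for the input `1+2`.
fn example_tree() -> Tree {
    Tree::Nonterm {
        name: "Expr",
        children: vec![
            Tree::Term { name: "INT", start: 0, len: 1 },
            Tree::Term { name: "PLUS", start: 1, len: 1 },
            Tree::Term { name: "INT", start: 2, len: 1 },
        ],
    }
}
```

The trade-off is the one described above: this is simple and has no generic machinery, but every additional output format would need its own printer.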
Final thoughts
This is much longer, and will probably require more thought and effort, than I initially imagined it would.
But I really don't feel like I've tinkered with it enough (or at all, really) to have formed any real opinions, either informed or strong.