Encoding MEI as JSON
The reference encoding for MEI is XML, but MEI may also be expressed in other data formats. The MEI-JSON format represents MEI as JavaScript Object Notation (JSON), a format that web applications can parse natively. There are several major differences between JSON and XML, however:
- JSON objects can have child objects, but they cannot have attributes;
- JSON objects cannot have "inline" tags, that is, elements mixed into running text;
- JSON data can only be represented as key-value pairs or as lists.
For MEI-JSON, attributes are given the key "@a", values the key "@v", and tails the key "@t". A value is the text encoded between an element's opening and closing tags, and a tail is the text that follows an element's closing tag, before the next tag, e.g.,
<bar>
<foo>this is a value</foo> and this is a tail.
</bar>
Functionally, this means that the following structures are equivalent:
{"date": [{"@a": {"xml:id": "17c6fe69-944e-4456-a160-e7edf3bad0ac" }},
{"@v": "2010-04-28"},
{"@t": "using an XSLT stylesheet (2mei v. 2.2.3)."}]}
<date xml:id="17c6fe69-944e-4456-a160-e7edf3bad0ac">2010-04-28</date> using an XSLT stylesheet (2mei v. 2.2.3).
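As a rough illustration of how the three keys can be read back, here is a minimal Python sketch using only the standard json module. The helper name split_parts is hypothetical and is not part of PyMEI; it simply assumes the list-of-objects layout shown in the JSON example above.

```python
import json

# The MEI-JSON example from above, held as a JSON string.
mei_json = """
{"date": [{"@a": {"xml:id": "17c6fe69-944e-4456-a160-e7edf3bad0ac"}},
          {"@v": "2010-04-28"},
          {"@t": "using an XSLT stylesheet (2mei v. 2.2.3)."}]}
"""


def split_parts(items):
    """Collect the attributes, value, and tail from an element's list.

    Hypothetical helper for illustration only; not a PyMEI function.
    """
    parts = {"attributes": {}, "value": None, "tail": None}
    for item in items:
        if "@a" in item:
            parts["attributes"] = item["@a"]
        elif "@v" in item:
            parts["value"] = item["@v"]
        elif "@t" in item:
            parts["tail"] = item["@t"]
    return parts


document = json.loads(mei_json)
date_parts = split_parts(document["date"])

print(date_parts["attributes"]["xml:id"])
print(date_parts["value"])
print(date_parts["tail"])
```

No XML parser is involved here: json.loads turns the text into ordinary dictionaries and lists, and the "@" keys are looked up like any other dictionary key.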
In the date example above, the JSON is verbose and, with its proliferation of square brackets and curly braces, arguably less "human-readable." So why would we ever want to represent MEI this way?
There are two reasons. The first is that JSON parses very easily into native JavaScript objects, without having to run through a separate (and often bulky) XML parser, which usually entails validation. For web applications, this means that you can send and retrieve JSON-encoded objects very quickly and easily. Children of an element are encoded as child objects in that element's list, distinguished by keys that do not begin with an @ sign, e.g.,
{"p": [{"@a": {"foo": "bar"}}, {"@v": "The date is"}, {"date": [{"@v": "2010-04-28."}]}]}
which is equivalent to the XML representation:
<p foo="bar">The date is <date>2010-04-28.</date></p>
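The mapping between the two representations can be sketched in Python with the standard xml.etree.ElementTree module. This is an illustration under stated assumptions, not PyMEI's implementation: the function name element_to_mei_json is hypothetical, whitespace-only text is dropped, and each element's tail is stored at the end of its own list, following the date example earlier on this page.

```python
import json
import xml.etree.ElementTree as ET


def element_to_mei_json(element):
    """Convert an ElementTree element into the list-based MEI-JSON layout.

    Hypothetical helper for illustration only; not a PyMEI function.
    """
    items = []
    if element.attrib:
        # Attributes go under the "@a" key as a plain object.
        items.append({"@a": dict(element.attrib)})
    if element.text and element.text.strip():
        # Text between the opening tag and the first child is the value.
        items.append({"@v": element.text})
    for child in element:
        # Children are nested objects whose key is the tag name.
        items.append(element_to_mei_json(child))
    if element.tail and element.tail.strip():
        # Text after the closing tag is the tail.
        items.append({"@t": element.tail})
    return {element.tag: items}


root = ET.fromstring('<p foo="bar">The date is <date>2010-04-28.</date></p>')
converted = element_to_mei_json(root)
print(json.dumps(converted))
```

ElementTree preserves whitespace, so the value produced for p in this sketch is "The date is " with its trailing space.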
Fragments of documents -- a note object, or a staff object with child objects -- may be sent from browser to server without sending an entire well-formed XML document.
The second reason is that it follows a well-known representation of data structures. JSON objects map almost directly onto native Python dictionaries and data types. For PyMEI, outputting JSON was simply a matter of converting an internal structure of Python data types to a JSON representation. Because this structure is a common starting point for conversion, it should also be possible to produce other formats, such as YAML, from it. This has not yet been tested, but may prove to be useful in the future.
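To make the dictionary mapping concrete, here is a small sketch using Python's standard json module. The note fragment and its attribute values are illustrative only and are not taken from PyMEI's output.

```python
import json

# A document fragment held as plain Python dictionaries and lists.
# The element and attribute values are illustrative assumptions.
note = {"note": [{"@a": {"pname": "c", "oct": "4", "dur": "4"}}]}

# Serialize the structure to a JSON string, e.g. to send to a browser.
encoded = json.dumps(note)

# Parse the string back into Python data types.
decoded = json.loads(encoded)

print(encoded)
print(decoded == note)
```

Because the structure contains only dictionaries, lists, and strings, it survives the round trip unchanged, and a fragment like this can be exchanged without wrapping it in a complete document.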
Every element in PyMEI has an as_json() method, as well as as_xml_string() (which returns an XML string representation), as_xml_object() (which returns an etree Element object), and as_dictionary() (which returns a plain Python dictionary object).