Unicode outside BMP

IIRC, PyShEx failed tests where the schema (or data?) had codepoints > U+FFFD. I stumbled across a [repo](https://github.com/ericprud/antlr-multiLang-template) that I created for dealing with this in Java and JavaScript, both of which use UTF-16 internally and thus require the grammar to be written in terms of surrogate pairs rather than codepoints U+10000 and above. I don't remember the state of this repo, but it could be handy to clone it and play with the Python target rather than experimenting in the larger ShEx g4.
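For anyone picking this up, here is a minimal Python sketch of the arithmetic that maps a supplementary-plane codepoint to the UTF-16 surrogate pair a Java or JavaScript grammar would have to match. The function names are hypothetical and are not taken from the repo or from PyShEx. The range U+10000 to U+EFFFF is used as an example on the assumption that the grammar follows the Turtle-style `PN_CHARS_BASE` production.

```python
from typing import List, Tuple


def to_surrogate_pair(codepoint: int) -> Tuple[int, int]:
    """Return the (high, low) UTF-16 surrogates for a codepoint above the BMP."""
    if not 0x10000 <= codepoint <= 0x10FFFF:
        raise ValueError("codepoint must be in U+10000..U+10FFFF")
    offset = codepoint - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)        # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)       # bottom 10 bits
    return high, low


def utf16_code_units(text: str) -> List[int]:
    """Return the UTF-16 code units of text, as Java or JavaScript would see them."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


if __name__ == "__main__":
    # Endpoints of the assumed range [#x10000-#xEFFFF], expressed as surrogates.
    for cp in (0x10000, 0xEFFFF):
        high, low = to_surrogate_pair(cp)
        print(f"U+{cp:X} -> \\u{high:04X} \\u{low:04X}")

    # Python strings are indexed by codepoint, so this is one character,
    # but two code units in UTF-16.
    ch = chr(0x10000)
    print(len(ch), [f"{u:04X}" for u in utf16_code_units(ch)])
```

Under that assumption, the range would be written as `[\uD800-\uDB7F][\uDC00-\uDFFF]` for the UTF-16 targets, while a Python target can work with the codepoints directly.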

Unicode outside BMP #87

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Unicode outside BMP #87

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions