wfdewith

This PR does the following things:

  • Change the s field in astnodes.String to bytes, since Lua strings can contain arbitrary bytes (including invalid UTF-8), while a Python 3 str is a sequence of Unicode code points and cannot represent such bytes faithfully.
  • Use a custom unescape function instead of ast.literal_eval, which gives more flexibility in correctly interpreting escaped characters in Lua strings. For example, both the \u{<hex>} and \<digit> escape codes in Lua have different semantics from Python: \<digit> sequences are decimal in Lua but octal in Python, and Python has no braced \u{...} form.
  • Update the lexer grammar to ensure the escape codes are valid.
  • Add a raw field to astnodes.String that stores the original string literal. The Lua output visitor uses it to reproduce the string literal exactly as it was written. This fixes some bugs in the output visitor; for example, "\"" previously became """ after a parse -> print round-trip, which is not valid Lua.
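
To illustrate the kind of unescaping described above, here is a minimal sketch of a Lua-style unescape routine that returns bytes. It is not the code from this PR: the name `lua_unescape`, the regular expression, and the error messages are assumptions made for the example. It takes the body of a quoted literal (without the surrounding quotes) and assumes the lexer has already rejected malformed input.

```python
import re

# Single-character escapes shared by Lua short strings.
_SIMPLE = {
    "a": b"\a", "b": b"\b", "f": b"\f", "n": b"\n", "r": b"\r",
    "t": b"\t", "v": b"\v", "\\": b"\\", '"': b'"', "'": b"'",
    "\n": b"\n",  # backslash followed by a literal newline
}

_ESCAPE = re.compile(
    r"\\(?:(?P<dec>[0-9]{1,3})"        # \ddd: up to three DECIMAL digits
    r"|x(?P<hex>[0-9a-fA-F]{2})"       # \xXX: exactly two hex digits
    r"|u\{(?P<uni>[0-9a-fA-F]+)\}"     # \u{XXX}: code point, emitted as UTF-8
    r"|z(?P<skip>[ \t\n\r\f\v]*)"      # \z: skip the following whitespace
    r"|(?P<other>.))",                 # any other single-character escape
    re.DOTALL,
)


def lua_unescape(body: str) -> bytes:
    """Hypothetical helper: decode the escapes in a Lua string body to bytes."""
    out = bytearray()
    pos = 0
    for m in _ESCAPE.finditer(body):
        # Copy the unescaped text before this escape as UTF-8.
        out += body[pos:m.start()].encode("utf-8")
        pos = m.end()
        if m.group("dec") is not None:
            value = int(m.group("dec"), 10)  # decimal, unlike Python's octal
            if value > 255:
                raise ValueError("decimal escape too large: " + m.group(0))
            out.append(value)
        elif m.group("hex") is not None:
            out.append(int(m.group("hex"), 16))
        elif m.group("uni") is not None:
            # "surrogatepass" lets lone surrogates through, as Lua does.
            out += chr(int(m.group("uni"), 16)).encode("utf-8", "surrogatepass")
        elif m.group("skip") is not None:
            pass  # \z and the whitespace after it produce no bytes
        else:
            ch = m.group("other")
            if ch not in _SIMPLE:
                raise ValueError("invalid escape sequence: " + m.group(0))
            out += _SIMPLE[ch]
    out += body[pos:].encode("utf-8")
    return bytes(out)


print(lua_unescape(r"\65\066"))   # b'AB' (Python would read \066 as octal)
print(lua_unescape(r"\xff"))      # b'\xff', which is not valid UTF-8
```

Returning bytes rather than str is what lets the `\xff` case work at all, which is the motivation for changing the type of s.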

Note that changing s to bytes is a breaking change.
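
As a rough illustration of what the breaking change means for callers, the sketch below uses a stand-in dataclass rather than the real astnodes.String. The `String` class shown here and the `string_text` helper are hypothetical; only the s and raw field names come from this PR. Code that previously treated s as text would now need to decode it explicitly and decide how to handle bytes that are not valid UTF-8.

```python
from dataclasses import dataclass


@dataclass
class String:
    """Hypothetical stand-in for astnodes.String after this PR."""
    s: bytes   # unescaped contents of the literal
    raw: str   # the literal exactly as written in the Lua source


def string_text(node: String, errors: str = "strict") -> str:
    """Hypothetical helper: decode a node's bytes for callers that want text."""
    return node.s.decode("utf-8", errors)


# Valid UTF-8 decodes cleanly.
text_node = String(s=b"caf\xc3\xa9", raw='"caf\\195\\169"')
print(string_text(text_node))                       # café

# Binary data needs an explicit error policy.
binary_node = String(s=b"\xff\x00", raw='"\\xff\\0"')
print(repr(string_text(binary_node, "replace")))    # '\ufffd\x00'
```

With the default "strict" policy the second node would raise UnicodeDecodeError, which makes the loss of information visible instead of silent.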
