22 commits
db1a5a5
Make lark.lark parse the same grammar as load_grammar.py, and make gr…
RossPatterson Feb 1, 2024
9493f81
1. Fix "Python type check / Format (pull request)" failure in test_la…
RossPatterson Feb 1, 2024
7a2880f
DOH!
RossPatterson Feb 1, 2024
83a374f
Remove unnessary anchor; coalesce ENBF item sets; fix %override grammar
RossPatterson Feb 2, 2024
fdffb5f
Revert lark.lark to its original form.
RossPatterson Feb 9, 2024
95c5742
Make lark.lark accept the same input as load_grammar.py, and provide …
RossPatterson Feb 9, 2024
200d6b5
Address some review comments.
RossPatterson Feb 9, 2024
0fb28f9
Fix review comment re: templates in terminals.
RossPatterson Feb 10, 2024
2ec5ef3
Fix review comment: Remove inlining from expansions, expansion, and v…
RossPatterson Feb 10, 2024
e9c026e
Address review comment: Make alias and expr optionals, not maybes, so…
RossPatterson Feb 10, 2024
9bf7ddf
Address review comment: Make '%declare rule' fail in post-processing …
RossPatterson Feb 10, 2024
7f02bd1
lark.lark doesn't allow backslash-nl as a line-continuation, but load…
RossPatterson Feb 13, 2024
4f7a5eb
Push optionality of rule_modifiers and priority down into rule_modifi…
RossPatterson Mar 15, 2024
40576d2
Fix bug introduced in #1018
RossPatterson Mar 15, 2024
daac65d
Issue #1388 is ready for review.
RossPatterson Mar 15, 2024
5f37365
Resolve @megalng comment re:@skipIf
RossPatterson Jun 21, 2024
697841b
Resolve @megalng comment re:tests/test_lark_validator.py
RossPatterson Jun 21, 2024
654e102
Resolve @megalng comment re:docstrings
RossPatterson Jun 21, 2024
33d7088
Resolve @erezsh comment re:typo
RossPatterson Jun 21, 2024
0d01fe2
Resolve part of @erezsh comment re: options.
RossPatterson Jun 21, 2024
20302ca
Remove obsolete 'options' parameter
RossPatterson Sep 24, 2024
ff01d96
Separate out tests that cannot be run with lark.lark.
RossPatterson Jan 21, 2025
66 changes: 51 additions & 15 deletions docs/grammar.md
@@ -59,26 +59,37 @@ Terminals are used to match text into symbols. They can be defined as a combinat
**Syntax:**

```html
<NAME> [. <priority>] : <items-to-match>
```

Terminal names must be uppercase. They must start with an underscore (`_`) or a letter (`A` through `Z`), and may be composed of letters, underscores, and digits (`0` through `9`). Terminal names that start with `_` will not be included in the parse tree, unless the `keep_all_tokens` option is specified.

Literals can be one of:

* `"string"`
* `/regular expression+/`
* `"case-insensitive string"i`
* `/re with flags/imulx`
* Literal range: `"a".."z"`, `"1".."9"`, etc. - Each literal must be a single character, and the range represents all values between the two literals, inclusive.

Each item is one of:

* `TERMINAL` - Another terminal, which cannot be defined in terms of this terminal.
* `"string literal"` - Literal, to be matched as-is.
* `"string literal"i` - Literal, to be matched case-insensitively.
* `/regexp literal/` - Regular expression literal. Can include flags.
* `"character".."character"` - Literal range. The range represents all values between the two characters, inclusive.
* `(item item ..)` - Group items
* `(item | item | ..)` - Alternate items.
* `[item item ..]` - Maybe. Same as `(item item ..)?`.
* `[item | item | ..]` - Maybe with alternates. Same as `(item | item | ..)?`.
* `item?` - Zero or one instances of item (a "maybe")
* `item*` - Zero or more instances of item
* `item+` - One or more instances of item
* `item ~ n` - Exactly *n* instances of item
* `item ~ n..m` - Between *n* to *m* instances of item (not recommended for wide ranges, due to performance issues)

Terminals are a linear construct, and therefore may not contain themselves (recursion isn't allowed).
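For illustration, a hypothetical identifier terminal (not part of any shipped grammar) combining several of the items above:

```perl
// Single-character literal ranges, joined with alternation
LETTER: "a".."z" | "A".."Z"
DIGIT: "0".."9"

// Grouping, alternation and repetition; built from other terminals
IDENT: ("_" | LETTER) ("_" | LETTER | DIGIT)*
```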

### Templates

Templates are expanded when preprocessing rules in the grammar. Templates are not allowed in terminals.

Definition syntax:

@@ -122,7 +133,7 @@ SIGNED_INTEGER: /
/x
```

Supported flags are one of: `imslux`. See Python's [regex documentation](https://docs.python.org/3/library/re.html#regular-expression-syntax) for more details on each one.

Regexps/strings of different flags can only be concatenated in Python 3.6+

@@ -196,25 +207,32 @@ _ambig

**Syntax:**
```html
<modifiers><name> : <items-to-match> [-> <alias> ]
| ...
```

Names of rules and aliases are always in lowercase. They must start with an underscore (`_`) or a letter (`a` through `z`), and may be composed of letters, underscores, and digits (`0` through `9`). Rule names that start with `_` will be inlined into their containing rule.

Rule definitions can be extended to the next line by using the OR operator (signified by a pipe: `|` ).

An alias is a name for the specific rule alternative. It affects tree construction (see [Shaping the tree](tree_construction#shaping_the_tree)).

The effect of a rule on the parse tree can be specified by modifiers. The `!` modifier causes the rule to keep all its tokens, whether they are named or not. The `?` modifier causes the rule to be inlined if it has only a single child. The `?` modifier cannot be used on rules whose names start with an underscore.
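For instance (hypothetical rules, for illustration only):

```perl
// '!' keeps all tokens: the parentheses appear in the tree
!parens: "(" NAME ")"

// '?' inlines the rule into its parent when it has a single child
?value: NAME | parens
```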

Each item is one of:

* `rule`
* `TERMINAL`
* `"string literal"` - Literal, to be matched as-is.
* `"string literal"i` - Literal, to be matched case-insensitively.
* `/regexp literal/` - Regular expression literal. Can include flags.
* `"character".."character"` - Literal range. The range represents all values between the two characters, inclusive.
* `template{parameter1, parameter2, ..}` - A template to be expanded with the specified parameters.
* `(item item ..)` - Group items
* `(item | item | ..)` - Alternate items. Note that the items cannot have aliases.
* `[item item ..]` - Maybe. Same as `(item item ..)?`, but when `maybe_placeholders=True`, generates `None` if there is no match.
* `item?` - Zero or one instances of item ("maybe")
* `[item | item | ..]` - Maybe with alternates. Same as `(item | item | ..)?`, but when `maybe_placeholders=True`, generates `None` if there is no match. Note that the items cannot have aliases.
* `item?` - Zero or one instances of item (a "maybe")
* `item*` - Zero or more instances of item
* `item+` - One or more instances of item
* `item ~ n` - Exactly *n* instances of item
@@ -297,12 +315,24 @@ Note that `%ignore` directives cannot be imported. Imported rules will abide by

Declare a terminal without defining it. Useful for plugins.

**Syntax:**
```html
%declare <TERMINAL>
```
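For example, as in Lark's indentation-based grammar examples, where the terminals are produced by the `Indenter` postlexer rather than matched from text:

```perl
%declare _INDENT _DEDENT
```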

### %override

Override a rule or terminal, affecting all references to it, even in imported grammars.

Useful for implementing an inheritance pattern when importing grammars.

**Syntax:**
```html
%override <TERMINAL> ... terminal definition ...
%override <rule> ... rule definition ...
```

**Example:**
```perl
%import my_grammar (start, number, NUMBER)
@@ -319,6 +349,12 @@ Useful for splitting up a definition of a complex rule with many different optio

Can also be used to implement a plugin system where a core grammar is extended by others.

**Syntax:**
```html
%extend <TERMINAL> ... additional terminal alternate ...
%extend <rule> ... additional rule alternate ...
```


**Example:**
```perl
1 change: 1 addition & 0 deletions docs/tree_construction.md
@@ -74,6 +74,7 @@ Lark will parse "((hello world))" as:
The brackets do not appear in the tree by design. The words appear because they are matched by a named terminal.


<a name="shaping_the_tree"></a>
## Shaping the tree

Users can alter the automatic construction of the tree using a collection of grammar features.
52 changes: 36 additions & 16 deletions lark/grammars/lark.lark
@@ -7,46 +7,66 @@ _item: rule
| token
| statement

rule: RULE_MODIFIERS? RULE rule_params priority? ":" rule_expansions
token: TOKEN priority? ":" token_expansions

rule_params: ["{" RULE ("," RULE)* "}"]
token_params: ["{" TOKEN ("," TOKEN)* "}"]

priority: "." NUMBER

statement: "%ignore" ignore_token -> ignore
| "%import" import_path ["->" name] -> import
| "%import" import_path name_list -> multi_import
| "%override" rule -> override_rule
| "%override" token -> override_token
| "%declare" name+ -> declare
| "%extend" rule -> extend_rule
| "%extend" token -> extend_token

ignore_token: ignore_item [ OP | "~" NUMBER [".." NUMBER]]
ignore_item: STRING | TOKEN | REGEXP

!import_path: "."? name ("." name)*
name_list: "(" name ("," name)* ")"

?rule_expansions: rule_alias (_VBAR rule_alias)*

?rule_inner_expansions: rule_expansion (_VBAR rule_expansion)*

?rule_alias: rule_expansion ["->" RULE]

?rule_expansion: rule_expr*

?rule_expr: rule_atom [OP | "~" NUMBER [".." NUMBER]]
?rule_atom: "(" rule_inner_expansions ")"
| "[" rule_inner_expansions "]" -> rule_maybe
| rule_value

?rule_value: RULE "{" rule_value ("," rule_value)* "}" -> rule_template_usage
| RULE
| token_value

?token_expansions: token_expansion (_VBAR token_expansion)*

?token_expansion: token_expr*

?token_expr: token_atom [OP | "~" NUMBER [".." NUMBER]]

?token_atom: "(" token_expansions ")"
Review comment (Member): token_atom and rule_atom are the same, why not merge them using templates? Same with rule_expr and atom_expr, rule_expansion etc.

Reply (Author):
The problem is that the original lark.lark can derive atom -> value -> template_usage, but templates aren't allowed for terminals (pending resolution of Issue #555). I didn't see any way, with templates, to prevent that derivation. As far as I can tell, templates are effectively a text-substitution macro capability (although I understand the implementation is a little more sophisticated).

I suppose I could have done something like this:

_expansion{expr_type}: expr_type*
_expr{atom_type}: atom_type [OP | "~" NUMBER [".." NUMBER]]
_atom{expansion_type, value_type}: "(" expansion_type ")"
                                 | "[" expansion_type "]" -> maybe
                                 | value_type

?rule_expansions: rule_alias (_VBAR rule_alias)*
?rule_inner_expansions: rule_expansion (_VBAR rule_expansion)*
?rule_alias: rule_expansion ["->" RULE]
?rule_expansion: _expansion(rule_expr)
?rule_expr: _expr(rule_atom)
?rule_atom: _atom(rule_inner_expansions, rule_value)
?rule_value: RULE "{" rule_value ("," rule_value)* "}" -> rule_template_usage
           | RULE
           | value

?token_expansions: token_expansion (_VBAR token_expansion)*
?token_expansion: _expansion(token_expr)
?token_expr: _expr(token_atom)
?token_atom: _atom(token_expansions, token_value)

?value: STRING ".." STRING -> literal_range
      | TOKEN
      | (REGEXP | STRING) -> literal

Is it what you're suggesting? It doesn't read as cleanly to my eyes, but I'll grant that I've been studying Lark pretty hard lately.

Reply (Member):
Before we dig into that, let's come to an agreement in the main conversation (regarding restrictive-validating grammar vs relaxed grammar + validating visitor).

Afterwards if it's still relevant, I will try to craft a solution to this problem.

| "[" token_expansions "]" -> token_maybe
| token_value

?token_value: STRING ".." STRING -> literal_range
| TOKEN
| (REGEXP | STRING) -> literal

name: RULE
| TOKEN

_VBAR: _NL? "|"
OP: /[+*]|[?](?![a-z])/
RULE: /_?[a-z][_a-z0-9]*/
RULE_MODIFIERS: /!|![?](?=[a-z])|[?]!?(?=[a-z])/
Review comment (Member):
Why do you need the ?=s here? I don't recall that ? or ! are valid anywhere else in the grammar..?

Reply (Author):
Yeah, I originally had a comment explaining that, but it would have been the only part of the entire grammar that had commentary, so I removed it. The ? rule modifier is not allowed on a rule whose name begins with _ (https://github.com/lark-parser/lark/blob/master/lark/load_grammar.py#L1057-L1059), and that bit or regex blocks that usage. So maybe I should put the comment back in.
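The effect of those lookaheads can be checked directly with Python's `re` module (an illustrative check, not part of the PR):

```python
import re

# The RULE_MODIFIERS terminal from the modified lark.lark
RULE_MODIFIERS = re.compile(r'!|![?](?=[a-z])|[?]!?(?=[a-z])')

def modifiers(text):
    """Return the modifier prefix matched at the start of `text`, or None."""
    m = RULE_MODIFIERS.match(text)
    return m.group(0) if m else None

print(modifiers("?rule2"))   # '?'  - allowed: '?' before a lowercase name
print(modifiers("?_rule2"))  # None - blocked: the lookahead rejects '?' before an inlined (_) rule
print(modifiers("!_rule2"))  # '!'  - allowed: '!' alone carries no lookahead
```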

Reply (Member):
Oh I see. But wouldn't users be able to bypass it by adding whitespace between them?

Reply (Author):
Drat! I missed a test case. But no, this test fails similarly in load_grammar.py, the original lark.lark, and my modified version:

            rule1: rule2
            ?! rule2: "a"
        """

load_grammar.py:

Traceback (most recent call last):
  File "C:\Ross\Source\lark\lark\parsers\lalr_parser_state.py", line 77, in feed_token
    action, arg = states[state][token.type]
KeyError: 'OP'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Ross\Source\lark\tests\test_lark_lark.py", line 169, in test_11_rule_modifier_query_bang_space_lg
    Lark(g)
...
  File "C:\Ross\Source\lark\lark\lexer.py", line 674, in lex
    raise UnexpectedToken(token, e.allowed, state=parser_state, token_history=[last_token], terminals_by_name=self.root_lexer.terminals_by_name)
lark.exceptions.UnexpectedToken: Unexpected token Token('OP', '?') at line 3, column 13.
Expected one of:
        * "%declare"
        * "%import"
        * "%extend"
        * RULE
        * "%override"
        * _NL
        * $END
        * "%ignore"
        * TOKEN
        * RULE_MODIFIERS
Previous tokens: [Token('_NL', '\n            ')]

Original lark.lark:

Traceback (most recent call last):
  File "C:\Ross\Source\lark\lark\lexer.py", line 665, in lex
    yield lexer.next_token(lexer_state, parser_state)
  File "C:\Ross\Source\lark\lark\lexer.py", line 598, in next_token
    raise UnexpectedCharacters(lex_state.text, line_ctr.char_pos, line_ctr.line, line_ctr.column,
lark.exceptions.UnexpectedCharacters: No terminal matches '?' in the current parser context, at line 3 col 13

            ?! rule2: "a"
            ^
...
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Ross\Source\lark\tests\test_lark_lark.py", line 176, in test_11_rule_modifier_query_bang_space_ll
    self.lark_parser.parse(g)
...
  File "C:\Ross\Source\lark\lark\lexer.py", line 674, in lex
    raise UnexpectedToken(token, e.allowed, state=parser_state, token_history=[last_token], terminals_by_name=self.root_lexer.terminals_by_name)
lark.exceptions.UnexpectedToken: Unexpected token Token('OP', '?') at line 3, column 13.
Expected one of:
        * _NL
        * "%import"
        * "%ignore"
        * "%override"
        * TOKEN
        * $END
        * "%declare"
        * RULE
Previous tokens: [Token('_NL', '\n            ')]

Modified lark.lark:

Traceback (most recent call last):
  File "C:\Ross\Source\lark\lark\lexer.py", line 665, in lex
    yield lexer.next_token(lexer_state, parser_state)
  File "C:\Ross\Source\lark\lark\lexer.py", line 598, in next_token
    raise UnexpectedCharacters(lex_state.text, line_ctr.char_pos, line_ctr.line, line_ctr.column,
lark.exceptions.UnexpectedCharacters: No terminal matches '?' in the current parser context, at line 3 col 13

            ?! rule2: "a"
            ^
...
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Ross\Source\lark\tests\test_lark_lark.py", line 176, in test_11_rule_modifier_query_bang_space_ll
    self.lark_parser.parse(g)
...
  File "C:\Ross\Source\lark\lark\lexer.py", line 674, in lex
    raise UnexpectedToken(token, e.allowed, state=parser_state, token_history=[last_token], terminals_by_name=self.root_lexer.terminals_by_name)
lark.exceptions.UnexpectedToken: Unexpected token Token('OP', '?') at line 3, column 13.
Expected one of:
        * "%declare"
        * RULE
        * RULE_MODIFIERS
        * $END
        * "%ignore"
        * "%extend"
        * "%import"
        * TOKEN
        * _NL
        * "%override"
Previous tokens: [Token('_NL', '\n            ')]

TOKEN: /_?[A-Z][_A-Z0-9]*/
STRING: _STRING "i"?
REGEXP: /\/(?!\/)(\\\/|\\\\|[^\/])*?\/[imslux]*/
169 changes: 169 additions & 0 deletions tests/test_grammar_formal.py
@@ -0,0 +1,169 @@
from __future__ import absolute_import

import os
from unittest import TestCase, main

from lark import lark, Lark, UnexpectedToken
from lark.load_grammar import GrammarError


# Based on TestGrammar, with lots of tests that can't be run elided.
class TestGrammarFormal(TestCase):
    def setUp(self):
        lark_path = os.path.join(os.path.dirname(lark.__file__), 'grammars/lark.lark')
        # lark_path = os.path.join(os.path.dirname(lark.__file__), 'grammars/lark.lark-ORIG')
        with open(lark_path, 'r') as f:
            self.lark_grammar = "\n".join(f.readlines())

    def test_errors(self):
        # raise NotImplementedError("Doesn't work yet.")
        l = Lark(self.lark_grammar, parser="lalr")

        # This is an unrolled form of the test_grammar.py:GRAMMAR_ERRORS tests, because the lark.lark messages vary.

        # 'Incorrect type of value', 'a: 1\n'
        self.assertRaisesRegex(UnexpectedToken, 'Unexpected token Token..NUMBER., .1..', l.parse, 'a: 1\n')
        # 'Unclosed parenthesis', 'a: (\n'
        self.assertRaisesRegex(UnexpectedToken, 'Unexpected token Token.._NL.,', l.parse, 'a: (\n')
        # 'Unmatched closing parenthesis', 'a: )\n'
        self.assertRaisesRegex(UnexpectedToken, 'Unexpected token Token..RPAR.', l.parse, 'a: )\n')
        # 'Unmatched closing parenthesis', 'a: )\n'
        self.assertRaisesRegex(UnexpectedToken, 'Unexpected token Token..RPAR.,', l.parse, 'a: )\n')
        # 'Unmatched closing parenthesis', 'a: (\n'
        self.assertRaisesRegex(UnexpectedToken, 'Unexpected token Token.._NL.,', l.parse, 'a: (\n')
        # 'Expecting rule or terminal definition (missing colon)', 'a\n'
        self.assertRaisesRegex(UnexpectedToken, 'Unexpected token Token.._NL.,', l.parse, 'a\n')
        # 'Expecting rule or terminal definition (missing colon)', 'A\n'
        self.assertRaisesRegex(UnexpectedToken, 'Unexpected token Token.._NL.,', l.parse, 'A\n')
        # 'Expecting rule or terminal definition (missing colon)', 'a->\n'
        self.assertRaisesRegex(UnexpectedToken, 'Unexpected token Token..__ANON_0., .->', l.parse, 'a->\n')
        # 'Expecting rule or terminal definition (missing colon)', 'A->\n'
        self.assertRaisesRegex(UnexpectedToken, 'Unexpected token Token..__ANON_0., .->', l.parse, 'A->\n')
        # 'Expecting rule or terminal definition (missing colon)', 'a A\n'
        self.assertRaisesRegex(UnexpectedToken, 'Unexpected token Token..TOKEN., .A..', l.parse, 'a A\n')
        # 'Illegal name for rules or terminals', 'Aa:\n'
        self.assertRaisesRegex(UnexpectedToken, 'Unexpected token Token..RULE., .a..', l.parse, 'Aa:\n')
        # 'Alias expects lowercase name', 'a: -> "a"\n'
        self.assertRaisesRegex(UnexpectedToken, 'Unexpected token Token..STRING., ."a"..', l.parse, 'a: -> "a"\n')
        # 'Unexpected colon', 'a::\n'
        self.assertRaisesRegex(UnexpectedToken, 'Unexpected token Token..COLON.,', l.parse, 'a::\n')
        # 'Unexpected colon', 'a: b:\n'
        self.assertRaisesRegex(UnexpectedToken, 'Unexpected token Token..COLON.,', l.parse, 'a: b:\n')
        # 'Unexpected colon', 'a: B:\n'
        self.assertRaisesRegex(UnexpectedToken, 'Unexpected token Token..COLON.,', l.parse, 'a: B:\n')
        # 'Unexpected colon', 'a: "a":\n'
        self.assertRaisesRegex(UnexpectedToken, 'Unexpected token Token..COLON.,', l.parse, 'a: "a":\n')
        # 'Misplaced operator', 'a: b??'
        self.assertRaisesRegex(UnexpectedToken, r'Unexpected token Token..OP., .\?..', l.parse, 'a: b??')
        # 'Misplaced operator', 'a: b(?)'
        self.assertRaisesRegex(UnexpectedToken, r'Unexpected token Token..OP., .\?..', l.parse, 'a: b(?)')
        # 'Misplaced operator', 'a:+\n'
        self.assertRaisesRegex(UnexpectedToken, r'Unexpected token Token..OP., .\+..', l.parse, 'a:+\n')
        # 'Misplaced operator', 'a:?\n'
        self.assertRaisesRegex(UnexpectedToken, r'Unexpected token Token..OP., .\?..', l.parse, 'a:?\n')
        # 'Misplaced operator', 'a:*\n'
        self.assertRaisesRegex(UnexpectedToken, r'Unexpected token Token..OP., .\*..', l.parse, 'a:*\n')
        # 'Misplaced operator', 'a:|*\n'
        self.assertRaisesRegex(UnexpectedToken, r'Unexpected token Token..OP., .\*..', l.parse, 'a:|*\n')
        # 'Expecting option ("|") or a new rule or terminal definition', 'a:a\n()\n'
        self.assertRaisesRegex(UnexpectedToken, 'Unexpected token Token..LPAR.,', l.parse, 'a:a\n()\n')
        # 'Terminal names cannot contain dots', 'A.B\n'
        self.assertRaisesRegex(UnexpectedToken, 'Unexpected token Token..TOKEN., .B..', l.parse, 'A.B\n')
        # 'Expecting rule or terminal definition', '"a"\n'
        self.assertRaisesRegex(UnexpectedToken, 'Unexpected token Token..STRING., ."a"..', l.parse, '"a"\n')
        # '%import expects a name', '%import "a"\n'
        self.assertRaisesRegex(UnexpectedToken, 'Unexpected token Token..STRING., ."a"..', l.parse, '%import "a"\n')
        # '%ignore expects a value', '%ignore %import\n'
        self.assertRaisesRegex(UnexpectedToken, 'Unexpected token Token..__ANON_2., .%import..', l.parse, '%ignore %import\n')

    # def test_empty_literal(self):
    #     raise NotImplementedError("Breaks tests/test_parser.py:_TestParser:test_backslash2().")

    # def test_ignore_name(self):
    #     raise NotImplementedError("Can't parse using parsed grammar.")

    # def test_override_rule_1(self):
    #     raise NotImplementedError("Can't parse using parsed grammar.")

    # def test_override_rule_2(self):
    #     raise NotImplementedError("Can't test semantics of grammar, only syntax.")

    # def test_override_rule_3(self):
    #     raise NotImplementedError("Can't test semantics of grammar, only syntax.")

    # def test_override_terminal(self):
    #     raise NotImplementedError("Can't parse using parsed grammar.")

    # def test_extend_rule_1(self):
    #     raise NotImplementedError("Can't parse using parsed grammar.")

    # def test_extend_rule_2(self):
    #     raise NotImplementedError("Can't test semantics of grammar, only syntax.")

    # def test_extend_term(self):
    #     raise NotImplementedError("Can't parse using parsed grammar.")

    # def test_extend_twice(self):
    #     raise NotImplementedError("Can't parse using parsed grammar.")

    # def test_undefined_ignore(self):
    #     raise NotImplementedError("Can't parse using parsed grammar.")

    def test_alias_in_terminal(self):
        l = Lark(self.lark_grammar, parser="lalr")
        g = """start: TERM
        TERM: "a" -> alias
        """
        # self.assertRaisesRegex(GrammarError, "Aliasing not allowed in terminals", Lark, g)
        self.assertRaisesRegex(UnexpectedToken, "Unexpected token Token.'__ANON_0', '->'.", l.parse, g)

    # def test_undefined_rule(self):
    #     raise NotImplementedError("Can't test semantics of grammar, only syntax.")

    # def test_undefined_term(self):
    #     raise NotImplementedError("Can't test semantics of grammar, only syntax.")

    # def test_token_multiline_only_works_with_x_flag(self):
    #     raise NotImplementedError("Can't test regex flags in Lark grammar.")

    # def test_import_custom_sources(self):
    #     raise NotImplementedError("Can't parse using parsed grammar.")

    # def test_import_custom_sources2(self):
    #     raise NotImplementedError("Can't parse using parsed grammar.")

    # def test_import_custom_sources3(self):
    #     raise NotImplementedError("Can't parse using parsed grammar.")

    # def test_my_find_grammar_errors(self):
    #     raise NotImplementedError("Can't parse using parsed grammar.")

    # def test_ranged_repeat_terms(self):
    #     raise NotImplementedError("Can't parse using parsed grammar.")

    # def test_ranged_repeat_large(self):
    #     raise NotImplementedError("Can't parse using parsed grammar.")

    # def test_large_terminal(self):
    #     raise NotImplementedError("Can't parse using parsed grammar.")

    # def test_list_grammar_imports(self):
    #     raise NotImplementedError("Can't test semantics of grammar, only syntax.")

    def test_inline_with_expand_single(self):
        l = Lark(self.lark_grammar, parser="lalr")
        grammar = r"""
        start: _a
        !?_a: "A"
        """
        # self.assertRaisesRegex(GrammarError, "Inlined rules (_rule) cannot use the ?rule modifier.", l.parse, grammar)
        # TODO Is this really catching the right problem?
        self.assertRaisesRegex(UnexpectedToken, "Unexpected token Token.'OP', '?'.", l.parse, grammar)

    # def test_line_breaks(self):
    #     raise NotImplementedError("Can't parse using parsed grammar.")


if __name__ == '__main__':
    main()