@sumittlearnbay

No description provided.

@sumittlearnbay
Author

Add non-overlapping ANTLR grammar examples for runtime tests

This commit introduces a set of nine non-obvious, non-overlapping grammars
under runtime-testsuite/test/org/antlr/v4/test/runtime/antlr_grammars/
to demonstrate and validate diverse ANTLR 4 parsing features.

Included grammars:

  • Arithmetic.g4 — arithmetic expression parsing
  • BooleanExpr.g4 — boolean and logical expressions
  • CSVFlexible.g4 — flexible CSV handling with optional quotes
  • JSONMini.g4 — minimal JSON subset parser
  • MiniConfig.g4 — simple key-value configuration format
  • MiniMarkdown.g4 — lightweight markdown-like parser
  • MiniQuery.g4 — SQL-inspired query syntax
  • UnitExpr.g4 — unit-based mathematical expressions

All grammars are self-contained and compile successfully with ANTLR 4.
Removed obsolete GrammarCompilationTest.java to fix build issues.

@sumittlearnbay
Author

successful check

@kaby76
Contributor

kaby76 commented Oct 12, 2025

For pedagogical purposes, your grammars should use EOF-terminated start rules. Since ANTLR 4.7, parsers behave differently at the point of an error: the parser backs up the input pointer to the last valid parse and reports success. For example, with Arithmetic.g4, the input 1+2 3 parses. (The parse tree is (expr (expr (INT "1")) (T__3 "+") (expr (INT "2"))).) This is not what most people expect, and it has resulted in dozens of issues in grammars-v4, other GitHub projects (e.g., Spice), and most recently in this repo (#4890).
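
A minimal sketch of what an EOF-terminated start rule could look like. The rule name `prog`, the alternatives, and the lexer rules here are illustrative assumptions, not the contents of the Arithmetic.g4 in this PR:

```antlr
grammar Arithmetic;

// Hypothetical start rule. The trailing EOF forces the parser to
// account for the entire input instead of stopping after a valid prefix.
prog : expr EOF ;

expr
    : expr ('*' | '/') expr   // listed first, so binds tighter than + and -
    | expr ('+' | '-') expr
    | INT
    | '(' expr ')'
    ;

INT : [0-9]+ ;
WS  : [ \t\r\n]+ -> skip ;
```

With a start rule of this shape, the input 1+2 3 should be reported as a syntax error at the 3, because the parser expects EOF after the expression rather than silently accepting the prefix 1+2.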

Also, I don't think your BooleanExpr grammar is a good example. Typically, the NOT operator has higher precedence than AND or OR, and textbooks on Boolean algebra follow this tradition. See https://bob.cs.sonoma.edu/IntroCompOrg-RPi/sec-balgebra.html#tb-boolprec. So, for input NOT TRUE OR FALSE, the parse tree should be (expr (expr (NOT "NOT") (expr (BOOL "TRUE"))) (OR "OR") (expr (BOOL "FALSE"))), not (expr (NOT "NOT") (expr (expr (BOOL "TRUE")) (OR "OR") (expr (BOOL "FALSE")))), which is what your grammar produces.
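
One way to get that precedence, sketched under the assumption that the grammar uses a single left-recursive `expr` rule: in ANTLR 4, alternatives listed earlier in a left-recursive rule bind more tightly, so placing the NOT alternative first gives it the highest precedence. The rule name `prog` and the exact token definitions are illustrative, not taken from the PR:

```antlr
grammar BooleanExpr;

// Hypothetical EOF-terminated start rule.
prog : expr EOF ;

expr
    : NOT expr        // highest precedence: NOT applies to the nearest operand
    | expr AND expr   // AND binds tighter than OR
    | expr OR expr    // lowest precedence
    | BOOL
    | '(' expr ')'
    ;

NOT  : 'NOT' ;
AND  : 'AND' ;
OR   : 'OR' ;
BOOL : 'TRUE' | 'FALSE' ;
WS   : [ \t\r\n]+ -> skip ;
```

With this ordering, NOT TRUE OR FALSE groups as (NOT TRUE) OR FALSE. Moving the NOT alternative below the OR alternative would instead let NOT capture the whole TRUE OR FALSE subexpression, which matches the second tree shown above.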

@sumittlearnbay
Author

Add EOF-terminated fixed grammars following ANTLR 4.7+ and Boolean precedence standards

  • All grammars now terminate start rules with EOF for full input coverage, ensuring correct behavior under ANTLR 4.7+ (resolves partial parse acceptance issue).
  • BooleanExpr.g4 updated to follow textbook operator precedence:
    NOT > AND > OR
    (as per Table 5.1.4 in Robert G. Plantz, Introduction to Computer Organization).
  • Other grammars (Arithmetic, JSONMini, CSVFlexible, etc.) adjusted for clarity, non-overlap, and pedagogical consistency.
  • Removed test scaffolds for simplified inclusion in runtime-testsuite.

Signed-off-by: Sumit Pawar [email protected]
