RecursionError in generated Python parser

The test corpus contained a case where  function is called with many (over 255) arguments, each just numbers. This used to be a syntax error in 3.6 and before -- but with more recent versions of the bytecode compiler it works for 10s of 1000s of arguments.

However, the generated parser in Python cannot handle it! It raises a RecursionError for `A(0, 1, ..., k)` if k is 221 or higher. The traceback seems to be an endless repetition of this fragment:
```
  File "/Users/guido/pegen/data/python_parser.py", line 5512, in _tmp_108
    if (literal := self.expect(",")) and (c := self.args()):
  File "/Users/guido/pegen/pegen/parser.py", line 65, in memoize_wrapper
    tree = method(self, *args)
  File "/Users/guido/pegen/data/python_parser.py", line 3041, in args
    if (a := self.named_expression()) and (b := self._tmp_108(),):
  File "/Users/guido/pegen/pegen/parser.py", line 65, in memoize_wrapper
    tree = method(self, *args)
```
I think this is actually about this rule:
```
args[expr_ty]:
    [...]
    | a=named_expression b=[',' c=args { c }] {
        [...] }
```
The part `[',' c=args {c}]` corresponds to `_tmp_108`. This is a straightforward recursion (not left-).

I can't actually fathom that the C parser doesn't segfault on this for me; it just slows down. (In the C code, it's `_tmp_109`, but the structure is the same.) The stack must automatically grow (on Mac, at least). But even there it would be nice if we could replace this by a simple loop.

E.g. maybe we could make this work?
```
args[expr_ty]:
    | a=starred_expression b=[',' c=args { c }] {
        _Py_Call(_PyPegen_dummy_name(p),
                 (b) ? CHECK(_PyPegen_seq_insert_in_front(p, a, ((expr_ty) b)->v.Call.args))
                     : CHECK(_PyPegen_singleton_seq(p, a)),
                 (b) ? ((expr_ty) b)->v.Call.keywords : NULL,
                 EXTRA) }
    | a=kwargs { _Py_Call(_PyPegen_dummy_name(p),
                          CHECK_NULL_ALLOWED(_PyPegen_seq_extract_starred_exprs(p, a)),
                          CHECK_NULL_ALLOWED(_PyPegen_seq_delete_starred_exprs(p, a)),
                          EXTRA) }
    | ','.(named_expression !'=')+ { XXX }
```
I haven't figured out what should go into XXX yet, but it's probably going to be more efficient, since we don't build up the array of arguments by prepending one at a time.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

RecursionError in generated Python parser #245

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

RecursionError in generated Python parser #245

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions