WIP: Non-recursive shunting yard algorithm for expression parsing #524

Draft
wants to merge 1 commit into
base: main

@c42f c42f commented Jan 3, 2025

This is very much a work in progress, but it shows promise for greatly reducing our recursion depth. The idea is to use the non-recursive shunting-yard algorithm for parsing operators and grouping parentheses, and to delegate back to the existing recursive formulation for other constructs.

This will likely solve #368 in all practical cases: I expect deeply recursive constructs to arise only from huge chains of operators and parentheses.

Currently our operator parsing consumes maybe 15 or so stack frames every time a grouping parenthesis is nested in combination with arithmetic. This quickly leads to absurdly deep program stacks and stack overflow. Moving to a system like a Pratt parser, where we skip unused precedence levels, would make this a single stack frame. Moving to the shunting-yard algorithm makes it zero stack frames, provided we can also use it to treat grouping parentheses (not an entirely simple thing, because parentheses in Julia are very syntactically overloaded).
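To make the stack-depth argument concrete, here is a minimal Python sketch of the shunting-yard idea. It is an illustration only, not the Julia code in this PR. The names `PRECEDENCE`, `tokenize` and `parse` are hypothetical, the operator table is a toy one (four left-associative binary operators), and the input is assumed to be well formed. Nesting is tracked on two explicit lists, so the call depth stays constant however deeply the parentheses nest.

```python
import re

# Toy precedence table (an assumption for this sketch); all four
# operators are treated as left-associative binary operators.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

def tokenize(src):
    # Integers, identifiers, operators and parentheses; anything else is skipped.
    return re.findall(r"\d+|[A-Za-z_]\w*|[-+*/()]", src)

def parse(src):
    operands = []   # completed subtrees
    operators = []  # pending operators and "(" markers

    def reduce_top():
        # Combine the top operator with its two operands into one subtree.
        op = operators.pop()
        rhs = operands.pop()
        lhs = operands.pop()
        operands.append((op, lhs, rhs))

    for tok in tokenize(src):
        if tok == "(":
            operators.append(tok)           # marker: nothing below it reduces yet
        elif tok == ")":
            while operators[-1] != "(":     # finish everything inside the group
                reduce_top()
            operators.pop()                 # drop the "(" marker
        elif tok in PRECEDENCE:
            # Left associativity: reduce operators that bind at least as tightly.
            while (operators and operators[-1] != "("
                   and PRECEDENCE[operators[-1]] >= PRECEDENCE[tok]):
                reduce_top()
            operators.append(tok)
        else:
            operands.append(tok)            # atom
    while operators:
        reduce_top()
    return operands[0]

print(parse("(1 + 2) * 3"))            # ('*', ('+', '1', '2'), '3')
print(parse("(" * 50000 + "x" + ")" * 50000))  # x
```

The second call nests 50000 grouping parentheses, far more than a recursive descent through every precedence level could survive, and the only thing that grows is the `operators` list.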

The biggest challenge here is to ensure we exactly reproduce all of Julia's operator precedence rules, which have many complicated special cases. The demo here doesn't cover many special cases, but it does show how a few of these can be dealt with quite simply in the non-recursive context. For example, chains of `+` and `*` need to parse into a single n-ary call, and it was reasonably easy to add this special case.
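One way the n-ary special case can fit into a shunting-yard loop is sketched below in Python. Again, this is an illustration under stated assumptions rather than the PR's implementation: `parse_nary`, `NARY` and the toy `PRECEDENCE` table are hypothetical names, input is assumed well formed, and the sketch assumes that an explicitly parenthesized operand stays a separate call rather than being merged into the chain. The trick is to store an arity next to each pending operator and bump it when the same chaining operator turns up again, instead of reducing and pushing a fresh one.

```python
import re

# Toy table for this sketch; NARY lists the operators that chain into one call.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}
NARY = {"+", "*"}

def parse_nary(src):
    tokens = re.findall(r"\d+|[A-Za-z_]\w*|[-+*/()]", src)
    operands = []   # completed subtrees
    operators = []  # [op, arity] entries; ["(", 0] marks an open group

    def reduce_top():
        # Pop one operator and as many operands as its recorded arity.
        op, arity = operators.pop()
        args = operands[-arity:]
        del operands[-arity:]
        operands.append((op, *args))

    def top_binds(tok, strict):
        # Does the operator on top of the stack bind tighter than tok
        # (or, when not strict, at least as tightly)?
        if not operators or operators[-1][0] == "(":
            return False
        top_prec = PRECEDENCE[operators[-1][0]]
        return top_prec > PRECEDENCE[tok] if strict else top_prec >= PRECEDENCE[tok]

    for tok in tokens:
        if tok == "(":
            operators.append(["(", 0])
        elif tok == ")":
            while operators[-1][0] != "(":
                reduce_top()
            operators.pop()
        elif tok in PRECEDENCE:
            while top_binds(tok, strict=True):
                reduce_top()                 # finish tighter operators first
            if tok in NARY and operators and operators[-1][0] == tok:
                operators[-1][1] += 1        # extend the chain in progress
            else:
                while top_binds(tok, strict=False):
                    reduce_top()             # ordinary left-associative reduce
                operators.append([tok, 2])
        else:
            operands.append(tok)
    while operators:
        reduce_top()
    return operands[0]

print(parse_nary("a + b + c"))      # ('+', 'a', 'b', 'c')
print(parse_nary("a - b - c"))      # ('-', ('-', 'a', 'b'), 'c')
print(parse_nary("a + (b + c)"))    # ('+', 'a', ('+', 'b', 'c'))
```

Because the `(` marker sits between an inner and an outer `+`, a parenthesized sum is reduced on its own at the closing parenthesis and never merges into the surrounding chain.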
