Optimization: Check for implicit anchor #165

Open
wants to merge 1 commit into
base: main
from

Conversation

Andersama
Contributor

Slight improvement on #95

From O'Reilly's book:

An engine with this optimization realizes that if a regex begins with .* or .+ and has no global alternation an implicit ^ can be prepended to the regex.

The intuition is that a leading .* or .+ consumes all the initial characters that are not really part of the pattern, so it swallows the entire input string in one go and then backtracks. Bumping the start position and retrying is much like remaining inside the .* or .+, so the later attempts cannot find anything the first attempt missed.
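The check described in the quote can be sketched on the raw pattern text. This is only an illustration under stated assumptions, not the code in this PR: `has_implicit_anchor` is a hypothetical name, it scans a string rather than a parsed pattern, and it ignores flavor details such as whether the dot matches newlines or how a leading `]` inside a character class is treated.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not part of any library): returns true when the
// pattern text begins with ".*" or ".+" and contains no top-level '|',
// i.e. when an implicit '^' could be prepended per the quoted rule.
constexpr bool has_implicit_anchor(std::string_view pattern) {
    // The pattern must start with a dot followed by '*' or '+'.
    if (pattern.size() < 2 || pattern[0] != '.' ||
        (pattern[1] != '*' && pattern[1] != '+')) {
        return false;
    }

    int depth = 0;          // current parenthesis nesting depth
    bool in_class = false;  // true while inside a [...] character class

    for (std::size_t i = 2; i < pattern.size(); ++i) {
        const char c = pattern[i];
        if (c == '\\') {  // skip the escaped character entirely
            ++i;
            continue;
        }
        if (in_class) {  // inside a class only ']' is special here
            if (c == ']') in_class = false;
            continue;
        }
        if (c == '[') {
            in_class = true;
        } else if (c == '(') {
            ++depth;
        } else if (c == ')') {
            --depth;
        } else if (c == '|' && depth == 0) {
            return false;  // global (top-level) alternation found
        }
    }
    return true;
}
```

Under these assumptions, `has_implicit_anchor(".*pattern")` and `has_implicit_anchor(".+(a|b)c")` are true, while `has_implicit_anchor(".*a|b")` is false because the alternation is at the top level.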

@Andersama
Contributor Author

Andersama commented Jan 6, 2021

Example: https://gcc.godbolt.org/z/3WvT98bfE

@Andersama
Contributor Author

In testing without the implicit anchor, searching for .*pattern, I could not get the benchmark to complete (there is an excessive amount of backtracking). With this check, however, it seems to perform roughly half as well as searching for pattern directly. It is probably worth extracting the .* from the pattern and running the search directly.
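A rough sketch of that last idea, using `std::regex` from the standard library as a stand-in rather than this project's own API. The helper name `search_without_leading_dot_star` is hypothetical. It only answers whether a match exists: the span matched by `.*pattern` differs from the span found by searching for `pattern` alone, so this shortcut is assumed to be acceptable only when the caller does not need the match position.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: if the pattern starts with a greedy ".*", drop that
// prefix and search for the remainder instead. Reports only whether the
// subject contains a match, not where the match starts or ends.
inline bool search_without_leading_dot_star(const std::string& pattern,
                                            const std::string& subject) {
    // Strip only a plain greedy ".*"; leave ".*?" alone so the remainder
    // does not begin with a dangling '?'.
    const bool strip = pattern.rfind(".*", 0) == 0 &&
                       (pattern.size() == 2 || pattern[2] != '?');
    const std::regex re(strip ? pattern.substr(2) : pattern);
    return std::regex_search(subject, re);
}
```

For example, `search_without_leading_dot_star(".*pattern", subject)` runs a plain search for `pattern`, which avoids letting the `.*` consume the whole subject and backtrack at each start position.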

Improves code generation for cases where .* can be treated as an anchor