Mismatch on (A|B) when A is a prefix of B #12

Open
sgbeal opened this issue Apr 12, 2014 · 4 comments


sgbeal commented Apr 12, 2014

Hiho,

I'm not sure whether this qualifies as a bug, a corner case, or a known limitation:

regex: (match|matchAll)
input: match matchAll

When looping, that matches twice, returning these strings: ["match", "match"]

If the regex order is swapped, so that A is not a prefix of B (matchAll|match), then the result is as expected: ["match", "matchAll"].

Maybe a note in the docs would suffice to cover this (maybe there is already one I overlooked).


eush77 commented Jun 5, 2015

From the PCRE specification (emphasis mine):

Vertical bar characters are used to separate alternative patterns. For example, the pattern

gilbert|sullivan

matches either "gilbert" or "sullivan". Any number of alternatives may appear, and an empty alternative is permitted (matching the empty string). The matching process tries each alternative in turn, from left to right, and the first one that succeeds is used. If the alternatives are within a subpattern (defined below), "succeeds" means matching the rest of the main pattern as well as the alternative in the subpattern.

I checked RegExps in Firefox, Chrome, CPython, and Ack, and they work the same way. Grep did find the longest, though. But overall I think this is the expected behavior.
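The leftmost-first behavior described in the PCRE excerpt can be reproduced with CPython's `re` module, one of the engines mentioned above. This is only a small sketch using the pattern and input from the original report; the variable names are made up for illustration.

```python
import re

text = "match matchAll"

# Alternatives are tried left to right; the first one that succeeds wins,
# so the shorter prefix "match" shadows "matchAll".
prefix_first = re.findall(r"(match|matchAll)", text)

# With the longer alternative listed first, "matchAll" gets its chance.
longest_first = re.findall(r"(matchAll|match)", text)

print(prefix_first)
print(longest_first)
```

Nothing follows the group in this unanchored pattern, so the engine never has a reason to backtrack into the second alternative.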


niwred commented Jan 20, 2017

A little variation on this:

IMHO the pattern "^(y|yes|YES|Yes|true|TRUE|True)$" should match the string "yes", but it does not. If the pattern is rearranged to "^(yes|y|YES|Yes|true|TRUE|True)$", then it matches. I quickly checked with Regex Buddy and none of the supported regex flavours show this behaviour.
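For comparison, here is a minimal sketch of how a backtracking engine (CPython's `re`, used here only as a reference point, not SLRE) treats the anchored pattern. After "y" matches and "$" fails, the engine backtracks and tries the next alternative, so "yes" is accepted. The `results` dictionary is just an illustrative name.

```python
import re

pattern = re.compile(r"^(y|yes|YES|Yes|true|TRUE|True)$")

# "y" matches first, but "$" then fails on "yes"; the engine backtracks
# into the group and tries "yes", which lets the whole pattern succeed.
results = {s: bool(pattern.search(s)) for s in ["y", "yes", "Yes", "true", "no"]}

for candidate, matched in results.items():
    print(candidate, matched)
```

This differs from the unanchored (match|matchAll) case: here the rest of the pattern forces the engine to revisit the alternation.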

cpq (Member) commented Jan 31, 2017

I am happy to accept a PR with the associated unit test :)


niwred commented Jan 31, 2017

Not a big issue in my project, since I have control over the regexes. Just thought I'd let you know. If a project uses SLRE for user-supplied regexes, the result might be unexpected.
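When the alternatives are plain literals and the pattern is under your control, one possible workaround is to order them longest first, so that no alternative is preceded by one of its own prefixes. The helper below is a hypothetical sketch (`longest_first_alternation` is not part of SLRE or any library), and it assumes the target engine accepts the resulting group syntax.

```python
import re

def longest_first_alternation(words):
    """Build a group of literal alternatives, longest first.

    Hypothetical helper: sorting by descending length guarantees that
    no alternative comes after one of its own proper prefixes.
    """
    ordered = sorted(set(words), key=lambda w: (-len(w), w))
    return "(" + "|".join(re.escape(w) for w in ordered) + ")"

group = longest_first_alternation(
    ["y", "yes", "YES", "Yes", "true", "TRUE", "True"]
)
print("^" + group + "$")
```

This only helps for literal alternatives; it does nothing for user-supplied patterns, which is the case raised above.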
